WO2022193104A1 - Method for generating light field prediction model, and related apparatus - Google Patents

Method for generating light field prediction model, and related apparatus

Info

Publication number
WO2022193104A1
Authority
WO
WIPO (PCT)
Prior art keywords
distance
cube
small
small cube
light field
Application number
PCT/CN2021/080893
Other languages
French (fr)
Chinese (zh)
Inventor
郑凯
韩磊
李琳
李选富
林天鹏
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to CN202180095331.6A (published as CN117015966A)
Priority to PCT/CN2021/080893 (published as WO2022193104A1)
Publication of WO2022193104A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules

Abstract

A method for generating a light field prediction model, and a related apparatus. The method comprises: establishing a cube model surrounding a photographed scene according to the respective photographing orientations of multiple sample images, wherein the cube model comprises multiple small cubes (voxels); respectively calculating multiple truncation distances for each of the multiple small cubes according to the multiple sample images (S402); sampling spatial points from each small cube according to the multiple truncation distances of that small cube, wherein each sampled spatial point has corresponding spatial coordinates; and training a light field prediction model according to the spatial coordinates of the sampled spatial points, wherein the light field prediction model is used for predicting a light field of the scene. The method can improve the sampling efficiency of the voxels, thereby improving the generation efficiency of the light field prediction model.

Description

A Method for Generating a Light Field Prediction Model, and Related Apparatus

Technical Field

The present application relates to the field of computer technology, and in particular to a method for generating a light field prediction model and a related apparatus.

Background

The prospects for virtual reality display technology are very broad. From the card box (cardbox) viewer, the Vive VR head-mounted display, and the Oculus Rift head-mounted display, to the VR Glass headset launched in 2019, virtual reality display hardware is becoming simple, easy to use, and widespread. However, in contrast to the rapid improvement of the hardware, high-quality virtual reality digital content is very limited. Unlike conventionally displayed two-dimensional (2D) digital content, virtual reality content requires the three-dimensional (3D) light field of the scene in order to enhance immersion (for example, so that the displayed content changes as the viewer moves), and capturing the 3D light field of a scene requires very complex hardware, which limits the flexibility of 3D light field acquisition. Using computer vision algorithms to obtain 3D light fields has therefore become a new research direction, but how to obtain 3D light fields efficiently and accurately based on computer vision algorithms is a technical problem facing those skilled in the art.
Summary of the Invention

The embodiments of the present application disclose a method for generating a light field prediction model and a related apparatus, which can improve the generation efficiency of the light field prediction model.

In a first aspect, an embodiment of the present application discloses a method for generating a light field prediction model. The method includes: establishing a cube model surrounding a photographed scene according to the respective shooting orientations of multiple sample images, where the cube model includes multiple small cubes (voxels); then calculating, according to the multiple sample images, multiple truncation distances for each of the multiple small cubes, where calculating one truncation distance of each small cube according to a first sample image includes: determining the truncation distance according to the distance from the camera to the small cube when the first sample image was captured and the distance from the camera to an object surface in the scene, the first sample image being any one of the multiple sample images; then sampling spatial points from the small cubes according to the multiple truncation distances of each small cube, where each sampled spatial point has corresponding spatial coordinates; and then training a light field prediction model according to the spatial coordinates of the sampled spatial points, where the light field prediction model is used to predict the light field of the scene.

In the above method, the voxel sampling points used to train the deep learning network (also called the light field prediction model, which is used to predict the three-dimensional light field) are obtained based on the depth information of the images. Specifically, a truncation distance is calculated from the depth information and the distance from the voxel to the camera, and differentiated sampling is then performed according to the magnitude of the truncation distance. On the one hand, this sampling approach quickly concentrates the samples in the key regions, improving sampling efficiency; on the other hand, the sampled voxels are essentially concentrated near object surfaces, so a deep learning network subsequently trained on such voxels can better represent the texture details of objects when predicting images, reducing blur and structural errors.
With reference to the first aspect, in an optional solution of the first aspect, after the training of the light field prediction model according to the spatial coordinates of the sampled spatial points, the method further includes: predicting the light field of the scene through the light field prediction model. That is, after training the light field prediction model, the model training device also predicts the light field through the model.

With reference to the first aspect or any of the foregoing possible implementations of the first aspect, in yet another optional solution of the first aspect, the sampling of spatial points from the small cubes according to the multiple truncation distances of each small cube includes: performing a fusion calculation on the multiple truncation distances of each small cube to obtain a fused truncation distance of the small cube; and sampling spatial points from each small cube according to its fused truncation distance. In this implementation, a fused truncation distance is calculated for every small cube, and sampling is then performed according to the fused truncation distance.

With reference to the first aspect or any of the foregoing possible implementations of the first aspect, in yet another optional solution of the first aspect, the sampling of spatial points from the small cubes according to the multiple truncation distances of each small cube includes: determining first small cubes, each of which has at least one truncation distance whose absolute value is smaller than a preset threshold; performing a fusion calculation on the multiple truncation distances of each first small cube to obtain a fused truncation distance of the first small cube; and sampling spatial points from the first small cube according to its fused truncation distance. In this approach, the fused truncation distance is not calculated for all small cubes but only for those with at least one truncation distance whose absolute value is smaller than the preset threshold: when the truncation distance is large, the corresponding small cube is far from the object surface and there is little need to sample it later. Therefore, the present application does not perform the fusion calculation on the truncation distances of such small cubes, which is equivalent to excluding them from the sampling scope in advance; this reduces the amount of calculation and improves the generation efficiency of the light field prediction model with essentially no loss in subsequent sampling quality.
With reference to the first aspect or any of the foregoing possible implementations of the first aspect, in yet another optional solution of the first aspect, in the process of sampling spatial points from the small cubes, more spatial points are sampled from small cubes with smaller fused truncation distances. It can be understood that a cube with a smaller fused truncation distance is closer to the object surface; compared with the spatial points in other small cubes on the same camera ray, the spatial points in such small cubes better reflect the pixel information. Therefore, training the light field prediction model more on the spatial points in such small cubes helps the model subsequently predict more accurate images.
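For illustration only, a minimal Python sketch of such differentiated sampling follows; the inverse-distance allocation rule and all names are assumptions of this sketch, since the text does not fix a specific formula:

    import numpy as np

    def allocate_samples(fused_d, total_samples):
        # fused_d: (N,) fused truncation distances of the voxels on one camera ray
        weights = 1.0 / (np.abs(fused_d) + 1e-3)   # smaller |d| -> larger weight
        weights /= weights.sum()
        return np.round(weights * total_samples).astype(int)  # samples per voxel

    def sample_points(voxel_centers, voxel_size, counts):
        # draw the allotted number of uniform random points inside each voxel
        pts = [c + (np.random.rand(k, 3) - 0.5) * voxel_size
               for c, k in zip(voxel_centers, counts) if k > 0]
        return np.concatenate(pts) if pts else np.empty((0, 3))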
With reference to the first aspect or any of the foregoing possible implementations of the first aspect, in yet another optional solution of the first aspect, the fusion calculation includes a weighted average calculation. It can be understood that a fused truncation distance obtained by weighted averaging more accurately reflects how far the small cube is from the object surface.

With reference to the first aspect or any of the foregoing possible implementations of the first aspect, in yet another optional solution of the first aspect, the weight of the truncation distance of a second small cube calculated based on the first sample image in the weighted average calculation is negatively correlated with the distance from the second small cube to the camera when the first sample image was captured, and/or positively correlated with a first included angle, the first included angle being the angle between the camera ray on which the second small cube lies and the normal vector of the object surface closest to the second small cube, the second small cube being any small cube in the cube model.

In this approach, weight values calculated from each sample image are used when computing the fused truncation distance. Since each weight is negatively correlated with the distance to the camera at capture time, and/or positively correlated with the first included angle, incorporating these weights into the fused truncation distance more accurately reflects the influence of the different orientations on the fused truncation distance.

With reference to the first aspect or any of the foregoing possible implementations of the first aspect, in yet another optional solution of the first aspect, before the truncation distances of each small cube in the cube model are calculated according to the multiple sample images, the method further includes: when each of the multiple sample images is captured, collecting depth information from the shooting viewpoint of that sample image, where the depth information represents the distance from the camera to object surfaces in the scene.

In this approach, collecting depth information from the shooting viewpoint of each sample image more accurately reflects the distance from the camera to the object surfaces in the photographed scene, which helps calculate more accurate truncation distances.

With reference to the first aspect or any of the foregoing possible implementations of the first aspect, in yet another optional solution of the first aspect, in the process of calculating the fused truncation distance of the second small cube, the weight of the truncation distance of the second small cube calculated based on the first sample image in the weighted average calculation, which may also be called the weight value w(p) of the second small cube calculated according to the first sample image, satisfies the following relationship:

w(p) = cos(θ)/distance(v)

where θ is the first included angle and distance(v) is the distance from the second small cube to the camera when the first sample image was captured.

As mentioned above, the weight value of the second small cube calculated from the first sample image is negatively correlated with the distance from the second small cube to the camera when the first sample image was captured, and/or positively correlated with the first included angle; the expression for w(p) here is one optional expression of this idea. When this weight is incorporated into the calculation of the fused truncation distance, the influence of the different orientations on the fused truncation distance is reflected more accurately.

With reference to the first aspect or any of the foregoing possible implementations of the first aspect, in yet another optional solution of the first aspect, the truncation distance d(p) of the second small cube calculated according to the first sample image satisfies the following relationship:

d(p) = sdf(p)/|u|

where sdf(p) is the difference between the distance from the camera to the second small cube when the first sample image was captured and the distance from the camera to the object surface in the scene, and u is a preset threshold.

It can be understood that this expression for d(p) is only one optional calculation formula for the truncation distance; other expressions may be used in practical applications.

With reference to the first aspect or any of the foregoing possible implementations of the first aspect, in yet another optional solution of the first aspect, if sdf(p) > |u|, then d(p) = 1; if sdf(p) < 0 and |sdf(p)| > |u|, then d(p) = -1.

In this approach, the truncation distance of small cubes in one range is assigned the value 1 and that of small cubes in another range is assigned the value -1, which facilitates subsequently treating these two kinds of small cubes identically, thereby improving calculation efficiency.
In a second aspect, an embodiment of the present application provides an apparatus for generating a light field prediction model. The apparatus includes:

an establishing unit, configured to establish a cube model surrounding the photographed scene according to the respective shooting orientations of multiple sample images, where the cube model includes multiple small cubes (voxels);

a first calculation unit, configured to calculate, according to the multiple sample images, multiple truncation distances for each of the multiple small cubes, where calculating one truncation distance of each small cube according to a first sample image includes: determining the truncation distance according to the distance from the camera to the small cube when the first sample image was captured and the distance from the camera to an object surface in the scene, the first sample image being any one of the multiple sample images;

a sampling unit, configured to sample spatial points from the small cubes according to the multiple truncation distances of each small cube, where each sampled spatial point has corresponding spatial coordinates; and

a second calculation unit, configured to train a light field prediction model according to the spatial coordinates of the sampled spatial points, where the light field prediction model is used to predict the light field of the scene.

In the above apparatus, the voxel sampling points used to train the deep learning network (also called the light field prediction model, which is used to predict the three-dimensional light field) are obtained based on the depth information of the images. Specifically, a truncation distance is calculated from the depth information and the distance from the voxel to the camera, and differentiated sampling is then performed according to the magnitude of the truncation distance. On the one hand, this sampling approach quickly concentrates the samples in the key regions, improving sampling efficiency; on the other hand, the sampled voxels are essentially concentrated near object surfaces, so a deep learning network subsequently trained on such voxels can better represent the texture details of objects when predicting images, reducing blur and structural errors.
With reference to the second aspect, in an optional solution of the second aspect, the apparatus further includes:

a prediction unit, configured to predict the light field of the scene through the light field prediction model. That is, after training the light field prediction model, the model training device also predicts the light field through the model.

With reference to the second aspect or any of the foregoing possible implementations of the second aspect, in yet another optional solution of the second aspect, in terms of sampling spatial points from the small cubes according to the multiple truncation distances of each small cube, the sampling unit is specifically configured to: perform a fusion calculation on the multiple truncation distances of each small cube to obtain a fused truncation distance of the small cube; and sample spatial points from each small cube according to its fused truncation distance. In this implementation, a fused truncation distance is calculated for every small cube, and sampling is then performed according to the fused truncation distance.

With reference to the second aspect or any of the foregoing possible implementations of the second aspect, in yet another optional solution of the second aspect, in terms of sampling spatial points from the small cubes according to the multiple truncation distances of each small cube, the sampling unit is specifically configured to: determine first small cubes, each of which has at least one truncation distance whose absolute value is smaller than a preset threshold; perform a fusion calculation on the multiple truncation distances of each first small cube to obtain a fused truncation distance of the first small cube; and sample spatial points from the first small cube according to its fused truncation distance. In this approach, the fused truncation distance is not calculated for all small cubes but only for those with at least one truncation distance whose absolute value is smaller than the preset threshold: when the truncation distance is large, the corresponding small cube is far from the object surface and there is little need to sample it later. Therefore, the present application does not perform the fusion calculation on the truncation distances of such small cubes, which is equivalent to excluding them from the sampling scope in advance; this reduces the amount of calculation and improves the generation efficiency of the light field prediction model with essentially no loss in subsequent sampling quality.

With reference to the second aspect or any of the foregoing possible implementations of the second aspect, in yet another optional solution of the second aspect, in the process of sampling spatial points from the small cubes, more spatial points are sampled from small cubes with smaller fused truncation distances. It can be understood that a cube with a smaller fused truncation distance is closer to the object surface; compared with the spatial points in other small cubes on the same camera ray, the spatial points in such small cubes better reflect the pixel information. Therefore, training the light field prediction model more on the spatial points in such small cubes helps the model subsequently predict more accurate images.

With reference to the second aspect or any of the foregoing possible implementations of the second aspect, in yet another optional solution of the second aspect, the fusion calculation includes a weighted average calculation. It can be understood that a fused truncation distance obtained by weighted averaging more accurately reflects how far the small cube is from the object surface.

With reference to the second aspect or any of the foregoing possible implementations of the second aspect, in yet another optional solution of the second aspect, the weight of the truncation distance of a second small cube calculated based on the first sample image in the weighted average calculation is negatively correlated with the distance from the second small cube to the camera when the first sample image was captured, and/or positively correlated with a first included angle, the first included angle being the angle between the camera ray on which the second small cube lies and the normal vector of the object surface closest to the second small cube, the second small cube being any small cube in the cube model.

In this approach, weight values calculated from each sample image are used when computing the fused truncation distance. Since each weight is negatively correlated with the distance to the camera at capture time, and/or positively correlated with the first included angle, incorporating these weights into the fused truncation distance more accurately reflects the influence of the different orientations on the fused truncation distance.
With reference to the second aspect or any of the foregoing possible implementations of the second aspect, in yet another optional solution of the second aspect, the apparatus further includes:

an acquisition unit, configured to collect, when each of the multiple sample images is captured, depth information from the shooting viewpoint of that sample image, where the depth information represents the distance from the camera to object surfaces in the scene.

In this approach, collecting depth information from the shooting viewpoint of each sample image more accurately reflects the distance from the camera to the object surfaces in the photographed scene, which helps calculate more accurate truncation distances.

With reference to the second aspect or any of the foregoing possible implementations of the second aspect, in yet another optional solution of the second aspect, in the process of calculating the fused truncation distance of the second small cube, the weight of the truncation distance calculated based on the first sample image in the weighted average calculation, which may also be called the weight value w(p) of the second small cube calculated according to the first sample image, satisfies the following relationship:

w(p) = cos(θ)/distance(v)

where θ is the first included angle and distance(v) is the distance from the second small cube to the camera when the first sample image was captured.

As mentioned above, the weight value of the second small cube calculated from the first sample image is negatively correlated with the distance from the second small cube to the camera when the first sample image was captured, and/or positively correlated with the first included angle; the expression for w(p) here is one optional expression of this idea. When this weight is incorporated into the calculation of the fused truncation distance, the influence of the different orientations on the fused truncation distance is reflected more accurately.
With reference to the second aspect or any of the foregoing possible implementations of the second aspect, in yet another optional solution of the second aspect, the truncation distance d(p) of the second small cube calculated according to the first sample image satisfies the following relationship:

d(p) = sdf(p)/|u|

where sdf(p) is the difference between the distance from the camera to the second small cube when the first sample image was captured and the distance from the camera to the object surface in the scene, and u is a preset threshold.

It can be understood that this expression for d(p) is only one optional calculation formula for the truncation distance; other expressions may be used in practical applications.

With reference to the second aspect or any of the foregoing possible implementations of the second aspect, in yet another optional solution of the second aspect, if sdf(p) > |u|, then d(p) = 1; if sdf(p) < 0 and |sdf(p)| > |u|, then d(p) = -1.

In this approach, the truncation distance of small cubes in one range is assigned the value 1 and that of small cubes in another range is assigned the value -1, which facilitates subsequently treating these two kinds of small cubes identically, thereby improving calculation efficiency.
In a third aspect, an embodiment of the present application provides a device for generating a light field prediction model, including a processor and a memory, where the memory is configured to store a computer program that, when run on the processor, implements the method described in the first aspect or any optional solution of the first aspect.

In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program that, when run on a processor, implements the method described in the first aspect or any optional solution of the first aspect.

By implementing the embodiments of the present application, the voxel sampling points used to train the deep learning network (also called the light field prediction model, used to predict the three-dimensional light field) are obtained based on the depth information of the images. Specifically, a truncation distance is calculated from the depth information and the distance from the voxel to the camera, and differentiated sampling is then performed according to the magnitude of the truncation distance. On the one hand, this sampling approach quickly concentrates the samples in the key regions, improving sampling efficiency; on the other hand, the sampled voxels are essentially concentrated near object surfaces, so a deep learning network subsequently trained on such voxels can better represent the texture details of objects when predicting images, reducing blur and structural errors.
Brief Description of the Drawings

FIG. 1 is a schematic diagram of a scenario of acquiring a NeRF according to an embodiment of the present application;

FIG. 2A is a schematic diagram of a voxel sampling scenario according to an embodiment of the present application;

FIG. 2B is a schematic diagram of how a three-dimensional light field and RGB information vary with depth information according to an embodiment of the present application;

FIG. 3 is a schematic architectural diagram of model training according to an embodiment of the present application;

FIG. 4 is a schematic flowchart of a method for determining the three-dimensional light field of a scene according to an embodiment of the present application;

FIG. 5 is a schematic diagram of the distances from a camera to a voxel and to an object surface according to an embodiment of the present application;

FIG. 6 is a schematic diagram of a truncation distance scenario according to an embodiment of the present application;

FIG. 7 is a schematic diagram of a truncation distance distribution according to an embodiment of the present application;

FIG. 8 is a schematic comparison diagram of the prediction effects of a light field prediction model according to an embodiment of the present application;

FIG. 9 is a schematic structural diagram of an apparatus for generating a light field prediction model according to an embodiment of the present application;

FIG. 10 is a schematic structural diagram of another apparatus for generating a light field prediction model according to an embodiment of the present application.
Detailed Description of Embodiments

The embodiments of the present application are described below with reference to the accompanying drawings.
Please refer to FIG. 1, which is a schematic diagram of a scenario of acquiring a neural radiance field (NeRF). The method shown in FIG. 1 uses a sparse set of images to synthesize the three-dimensional light field of a complex scene. Specifically: as in part (a), for a scene represented in a five-dimensional (5D) coordinate system, a single 5D coordinate (x, y, z, θ, φ) on a camera ray is input to a deep learning network in fully connected form, where the coordinate (x, y, z, θ, φ) contains the spatial position (x, y, z) and the viewing direction (θ, φ). As in part (b), the deep learning network reconstructs (that is, outputs) the RGB information corresponding to the coordinate (x, y, z, θ, φ), which can be denoted RGBσ and includes density and color. As in part (c), volume rendering is performed on RGBσ, and the result is compared with the actual RGB information at the coordinate (x, y, z, θ, φ) to obtain a rendering loss, as in part (d); the deep learning network continues to be trained based on this rendering loss. After the deep learning network has been trained on the 5D coordinates of the spatial points collected on the camera rays according to the above flow (a), (b), (c), (d), it can predict the RGB information of new 5D coordinates. Therefore, for a scene represented in the 5D coordinate system, the deep learning network can predict the view of the scene from any angle, and the set of views from all viewing angles constitutes the three-dimensional light field of the scene.
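To make the rendering and loss steps in parts (c) and (d) concrete, the following is a minimal sketch of the quadrature commonly used in the NeRF literature (this formula is background to, not part of, the present application; the network outputs rgb and sigma are assumed to be given):

    import numpy as np

    def volume_render(rgb, sigma, t):
        # rgb: (N, 3) colors and sigma: (N,) densities output by the network
        # for N samples at depths t (N,) along one camera ray
        delta = np.diff(t, append=t[-1] + 1e10)            # spacing between samples
        alpha = 1.0 - np.exp(-sigma * delta)               # per-segment opacity
        trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1] + 1e-10]))
        weights = alpha * trans                            # contribution per sample
        return (weights[:, None] * rgb).sum(axis=0)        # rendered pixel color

    # rendering loss against the ground-truth pixel color pixel_rgb:
    # loss = np.sum((volume_render(rgb, sigma, t) - pixel_rgb) ** 2)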
After analyzing the operating principle of the deep learning network used in the scenario shown in FIG. 1, the inventors of the present application found that, for a given camera ray (also called a viewing ray), if it passes through a point on an object surface, the RGB information (that is, the pixel) corresponding to that camera ray is mainly characterized by the depth and color of that point on the object surface. Under this premise, in the process of training the above deep learning network to obtain the three-dimensional light field, sampling points are first obtained by uniformly sampling along each camera ray; as shown in FIG. 2A, parts (e) and (f) both show uniform sampling, with part (f) sampled at a finer granularity. Each sampling point is then analyzed during training to find the sampling points near the object surface; based on these, the approximate range in which the "object surface" lies can be determined, and further training and analysis based on this approximate range yields the sampling points on the object surface. Since the depth and color of the sampling points on the object surface essentially reflect the RGB information (pixel) corresponding to the camera ray, a network for predicting the three-dimensional light field can be trained based on those sampling points. As shown in FIG. 2B, the horizontal axis represents the variation of the image depth information along one camera ray, and the vertical axis represents the variation with depth of the three-dimensional light field on that camera ray and of the RGB information on that ray; the weights of the deep learning network used to predict the three-dimensional light field and the RGB information therefore coincide to a high degree in how they are affected by depth. Since depth information reflects the distance from the camera to the object surface, it can be considered that the network weights and the RGB information also coincide to a high degree in how they are affected by the object surface.
In this process, because the deep learning network samples uniformly along each camera ray, finding the sampling points near and on the object surface for each camera ray requires point-by-point trial and error. This trial-and-error approach places a very heavy computational burden on the deep learning network, so the network converges slowly; moreover, trial and error cannot accurately locate the sampling points on the object surface along a camera ray, so a deep learning network trained on such sampling points has limited accuracy, and the three-dimensional light field it predicts has a relatively large error.
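For reference, the uniform (stratified) ray sampling criticized above can be sketched as follows (a minimal illustration of the general idea, not code from the present application):

    import numpy as np

    def uniform_ray_samples(near, far, n_samples):
        # one stratified-uniform depth sample per equal-width bin along the ray
        edges = np.linspace(near, far, n_samples + 1)
        return edges[:-1] + (edges[1:] - edges[:-1]) * np.random.rand(n_samples)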
The inventors of the present application consider that, since depth information reflects the distance from the camera to the object surface, importing the depth information into the deep learning network can reduce its wasted computation and improve its convergence speed and efficiency. Specifically, the depth information is used to concentrate sampling and training near the depth value on each camera ray, ensuring that early in training the network rapidly converges to the object surface along each camera ray and concentrates its computing power on representing the texture details of objects, avoiding blur and structural errors.

Please refer to FIG. 3, which is a schematic architectural diagram of model training according to an embodiment of the present application. The architecture includes a model training device 301 and one or more model using devices 302, which communicate with each other in a wired or wireless manner. The model training device 301 can therefore send the trained deep learning network (or light field prediction model) for predicting the three-dimensional light field to a model using device 302; correspondingly, the model using device 302 predicts the three-dimensional light field in a specific scene through the received deep learning network. Of course, the model training device may also itself predict the three-dimensional light field in the scene based on the trained deep learning network.

Optionally, the model using device 302 may feed the results predicted based on the model back to the model training device 301, so that the model training device 301 can further train the model based on those prediction results; the retrained model can then be sent to the model using device 302 to update the original model.

The model training device 301 may be a device with relatively strong computing power, for example a server, or a server cluster composed of multiple servers.

The model using device 302 is a device that needs to obtain the three-dimensional light field of a specific scene, for example a handheld device (such as a mobile phone, a tablet computer, or a palmtop computer), a vehicle-mounted device (such as a car, a bicycle, an electric vehicle, an aircraft, or a ship), a wearable device (such as a smart watch (for example an iWatch), a smart band, or a pedometer), a smart home device (such as a refrigerator, a television, an air conditioner, or an electricity meter), a smart robot, workshop equipment, and so on.

The following takes a car and a mobile phone as examples of the model using device 302.

For example, with economic development the number of cars worldwide keeps increasing, and navigation maps play a key role in improving the efficiency with which cars travel on roads. On some complicated road sections it is often difficult for users to obtain complete information about the road surface; however, the method of the embodiments of the present application can predict the three-dimensional light field of a specific scene, so the user can be shown the complicated road section viewed from every direction, which helps the user control the vehicle accordingly and improves traffic efficiency.

As another example, online shopping is now very common, and consumers learn about the form of a product by viewing its photos online. At present the photos of many products are limited, and consumers can only see the product from some directions; however, the method of the embodiments of the present application can predict the three-dimensional light field of the product, so the user can view the product form from every angle, which helps the user choose a product that suits them better.

There are many other cases where a three-dimensional light field needs to be predicted, such as VR house viewing, VR movies, games, and street view production.

Please refer to FIG. 4, which is a schematic flowchart of a method for determining the three-dimensional light field of a scene according to an embodiment of the present application. The method may be implemented based on the architecture shown in FIG. 3 or on other architectures. When it is implemented based on the architecture shown in FIG. 3, steps S400-S405 may be performed by the model training device 301 and step S406 by the model using device 302. When it is implemented based on other architectures, steps S400-S406 may be completed by one device or cooperatively by multiple devices; the application field of the device or devices is not limited here, as long as the corresponding computing capability and/or communication capability is provided. Steps S400-S406 are as follows:
Step S400: Input multiple sample images and information about the shooting orientations of the multiple sample images into the deep learning network.

The multiple sample images are images of the same scene captured by a camera at different orientations. Optionally, an orientation (pose) includes position coordinates (x, y, z) and a viewing direction (θ, φ). For example, with the world coordinate system as the reference, x, y, and z in the position coordinates (x, y, z) represent longitude, latitude, and altitude respectively, and θ and φ in the viewing direction (θ, φ) represent the horizontal angle and the vertical angle respectively. Of course, the orientation can also be expressed in other ways.
Step S401: Establish a cube model surrounding the photographed scene according to the shooting orientations at which the multiple sample images were captured.

It can be understood that a cube model enclosing the scene can be constructed based on the multiple different orientations at which the camera captured the photos. Optionally, the length, width, and height of the cube model are respectively the maxima of the scene length C, width W, and height H calculated from the multiple orientations. The scene is not limited here; for example, it may be a scene whose main object is a person, a scene whose main object is a tree, or a scene whose main content is the interior structure of a house, and so on. The multiple different orientations mentioned here may be the four directions east, west, south, and north, or different orientations relative to other references.

The cube model is divided into multiple small cubes (grid voxels), voxels for short. Optionally, the voxels can be obtained by dividing the cube model into N equal parts, and their size can be set according to actual needs. It can be understood that smaller voxels help improve the training accuracy of the subsequent deep learning network but also increase the computational load during training, so the voxel size is usually chosen by weighing accuracy against the computing power of the device; for example, the voxel size may be set to 2 centimeters (cm). In general, any voxel is considered to be either on the surface of an object in the scene or not on an object surface (in which case it can be regarded as lying in an empty region of the scene).
Optionally, if the position of a voxel in the cube model is expressed as three-dimensional position coordinates g, that is (x, y, z), then when the voxels of the cube model are subsequently fed into the deep learning network for training, one GPU thread can process the voxels at one (x, y), that is, one GPU thread scans and processes the lattice column at one (x, y) coordinate. Optionally, the coordinates of the center point of a voxel are generally taken as the coordinates of that voxel; of course, the coordinates of another point in the voxel may also be used, for example its upper-left corner or its upper-right corner.
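As an illustration of the grid construction described in step S401, the following minimal Python sketch builds the voxel centers; the bounding margin and the derivation of the bounding cube from the camera positions are assumptions of this sketch, not details fixed by the application:

    import numpy as np

    def build_voxel_grid(cam_positions, voxel_size=0.02, margin=1.0):
        # bounding cube derived from the camera positions, padded by a margin
        lo = cam_positions.min(axis=0) - margin
        hi = cam_positions.max(axis=0) + margin
        n = np.ceil((hi - lo) / voxel_size).astype(int)   # voxels per axis
        idx = np.stack(np.meshgrid(*[np.arange(k) for k in n],
                                   indexing="ij"), axis=-1)
        centers = lo + (idx + 0.5) * voxel_size           # voxel center coordinates
        return centers.reshape(-1, 3), n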
Step S402: Calculate multiple truncation distances for each of the multiple small cubes according to the multiple sample images.

The multiple sample images are captured by the camera at different orientations; therefore, the truncation distance of each small cube in the cube model needs to be calculated separately for the sample image captured at each orientation. For ease of understanding, the following takes a first sample image as an example, the first sample image being any one of the multiple sample images. The truncation distance of each small cube calculated according to the first sample image is determined according to the distance from the camera to the small cube when the first sample image was captured and the distance from the camera to an object surface in the scene, as illustrated below.

Suppose the first sample image is captured by the camera at a first orientation. When shooting at the first orientation, the camera also collects depth information of the scene, which reflects the distance from the camera to object surfaces in the scene. Therefore, in the image captured at the first orientation, each pixel x corresponds to a depth value; the depth value value(x) of pixel x reflects the distance from the camera to the voxel on the object surface along the camera ray corresponding to pixel x.

The following takes a second small cube (any voxel in the cube model) as an example. Its position is known, and when the camera shoots at the first orientation, the camera position is also known. Therefore, the distance from the second small cube to the camera when the camera shoots at the first orientation is known and can be denoted distance(v).

Therefore, when the camera shoots at the first orientation, the distance from the second small cube in the cube model to the object surface in the scene can be written as sdf(p) = value(x) - distance(v). The specific scenario and geometric relationship are shown in FIG. 5.

The embodiments of the present application focus on voxels near object surfaces. Assuming that a voxel whose distance to the object surface does not exceed a preset threshold u is considered to be near the surface, whether a voxel is near the surface, and how near, can be expressed by a truncation distance d(p), where the truncation distance d(p) of the second small cube voxel can be calculated as follows:

d(p) = sdf(p)/|u|

Optionally, if sdf(p) > |u|, then d(p) = 1; if sdf(p) < 0 and |sdf(p)| > |u|, then d(p) = -1. FIG. 6 illustrates the functional relationship between the truncation distance d(p) and u. FIG. 7 illustrates the distribution of the truncation distances d(p) of some of the small cube voxels in the cube model corresponding to the photographed scene.
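The formula and the clamping rule above transcribe directly into code (a minimal sketch; the variable names are illustrative only):

    def truncation_distance(sdf_p, u):
        # d(p) = sdf(p)/|u|, clamped to +1 / -1 outside the band of width |u|
        if sdf_p > abs(u):
            return 1.0
        if sdf_p < 0 and abs(sdf_p) > abs(u):
            return -1.0
        return sdf_p / abs(u)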
In an optional solution, a weight value can also be computed for each voxel. The weight value of the second small cube computed from the first sample image is negatively correlated with the distance from the second small cube to the camera when the first sample image was captured, and/or positively correlated with the cosine of a first included angle, the first included angle being the angle between the camera ray on which the second small cube lies and the normal vector of the object surface closest to the second small cube. This weight value is used in the subsequent fusion calculation of the second small cube's truncated distances. The weight is determined this way because, when the camera shoots at the first orientation, the pixel information of any voxel in the cube model (density, color, and so on) depends on several factors: the closer the voxel is to the camera, the more pixel information it carries, and the smaller its first included angle (i.e., the less the viewing ray deviates from the surface normal), the more pixel information it carries.
Optionally, the weight value w(p) of the second small cube can be expressed as:
w(p) = cos(θ)/distance(v)
where θ is the first included angle and distance(v) is the distance from the second small cube to the camera when the first sample image was captured.
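Under the same naming assumptions, the per-view weight can be sketched as:

```python
import math

def voxel_weight(theta: float, distance_v: float) -> float:
    """w(p) = cos(theta) / distance(v) for one voxel and one sample image.

    theta      -- first included angle (radians) between the voxel's camera ray
                  and the normal of the nearest object surface
    distance_v -- camera-to-voxel distance for this sample image
    """
    return math.cos(theta) / distance_v
```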
Based on the above, the truncated distance d(p) and weight value w(p) of every voxel in the cube model can be computed from the first sample image; this set of d(p) and w(p) values is relative to the first orientation. Applying the same computation to the other sample images yields, for each image, another set of d(p) and w(p) values for every voxel, relative to the corresponding orientation.
It should be noted that, before computing d(p) and w(p) for each voxel, the voxel coordinates and the camera coordinates can first be unified into one coordinate system to simplify the computation. For example, according to the size and the number of the voxels, the position g of each voxel in the cube model can be converted into a point p in the world coordinate system; p is then mapped to a point v in the camera coordinate system according to the camera pose matrix, and the corresponding pixel x in the depth image is determined from v according to the camera intrinsic matrix. The depth value of pixel x is then read; it equals the distance value(x) from the camera to the surface voxel on the camera ray through position g. The distance from the mapping point v to the origin of the camera coordinate system is recorded as distance(v). The obtained value(x) and distance(v) can then be used to compute the truncated distance d(p) and the weight value w(p).
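The coordinate pipeline above can be sketched as follows, assuming a standard pinhole camera model; the pose matrix T_wc, intrinsic matrix K and grid layout below are illustrative assumptions, since the original does not fix these conventions.

```python
import numpy as np

def voxel_to_pixel(g, voxel_size, origin, T_wc, K):
    """Map a voxel grid index g to its depth-image pixel x and distance(v).

    g          -- integer grid index (i, j, k) of the voxel in the cube model
    voxel_size -- edge length of one voxel
    origin     -- world coordinates of the cube model's corner
    T_wc       -- 4x4 world-to-camera pose matrix for this sample image
    K          -- 3x3 camera intrinsic matrix
    """
    p = origin + (np.asarray(g) + 0.5) * voxel_size  # grid index -> world point p
    v = (T_wc @ np.append(p, 1.0))[:3]               # world point -> camera point v
    uv = K @ v                                       # project into the image
    x = (uv[:2] / uv[2]).round().astype(int)         # pixel x in the depth image
    distance_v = np.linalg.norm(v)                   # distance(v) to camera origin
    return x, distance_v
```

The depth value(x) is then read from the depth image at pixel x, e.g. value_x = depth_image[x[1], x[0]], and the pair value(x), distance(v) feeds the d(p) and w(p) computations above.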
In an optional solution, when each of the plurality of sample images is captured, depth information is collected from the shooting viewpoint of that sample image; the depth information characterizes the distance from the camera to object surfaces in the scene, i.e., the value(x) above. The depth information can be collected by a sensor, for example a radar or infrared sensor, deployed on or near the camera.
Step S403: Perform fusion calculation on the truncated distances computed from the plurality of sample images, to obtain fused truncated distances of the small cubes in the cube model.
In a first optional solution, fusion calculation is performed on the plurality of truncated distances of every small cube, yielding a fused truncated distance for each; during subsequent sampling, spatial points are sampled from each small cube according to its fused truncated distance.
In a second optional solution, the first small cubes are determined, a first small cube being a small cube for which at least one truncated distance has an absolute value smaller than a preset threshold; fusion calculation is then performed only on the truncated distances of the first small cubes computed from the plurality of sample images, yielding their fused truncated distances. For example, if multiple truncated distances of a voxel have been computed from multiple sample images, and the smallest of them is below the preset threshold (which can be set to 1, for example), that voxel is regarded as a first small cube. During subsequent sampling, spatial points are sampled from the first small cubes according to their fused truncated distances.
The principle of the fusion calculation is illustrated below, taking the second optional solution as an example.
Optionally, the fusion calculation can work as follows: the fused truncated distance of the second small cube (any small cube in the cube model) is a weighted average of the truncated distances computed from the different sample images, where the weight of the truncated distance computed from the first sample image equals the weight value w(p) of the second small cube computed from the first sample image, as described above. For example, the following operations are performed over the plurality of sample images: the truncated distance d(p) computed from one sample image is taken as the initial fused truncated distance D(p), and the weight value w(p) computed from that image is taken as the initial fused weight value W(p); then, for each remaining sample image in turn, the d(p) computed from the current image is merged into the existing D(p) to update it, and the w(p) computed from the current image is merged into the existing W(p) to update it, until the truncated distances and weight values computed from all sample images have been fused.
Optionally, the fusion calculation can be expressed as:
D(p) = (W(p)*D(p) + w(p)*d(p)) / (W(p) + w(p))
W(p) = W(p) + w(p)
where D(p) is the fused truncated distance of the second small cube, W(p) is its fused weight value, d(p) is the truncated distance of the second small cube computed from the current sample image, and w(p) is the weight value of the second small cube computed from the current sample image.
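A minimal running-average sketch of this fusion, assuming the per-view d(p) and w(p) values have already been computed and stacked into arrays (the array layout is an assumption):

```python
import numpy as np

def fuse_tsdf(d_per_view: np.ndarray, w_per_view: np.ndarray):
    """Fuse per-view truncated distances into D(p) and W(p) for every voxel.

    d_per_view -- shape (num_views, num_voxels): truncated distances d(p)
    w_per_view -- shape (num_views, num_voxels): weight values w(p)
    """
    D = d_per_view[0].copy()           # initial fused truncated distance
    W = w_per_view[0].copy()           # initial fused weight value
    for d, w in zip(d_per_view[1:], w_per_view[1:]):
        D = (W * D + w * d) / (W + w)  # weighted running average of d(p)
        W = W + w                      # accumulate the weights
    return D, W
```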
It can be understood that, in this way, the final fused truncated distance D(p) and fused weight value W(p) of all voxels in the cube model can be computed. Optionally, the final D(p) and W(p) of all voxels can be fed into Marching Cubes to compute triangular faces that render the truncated distance field of the cube model.
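For visualization only, the fused field can be meshed with an off-the-shelf Marching Cubes implementation, for example scikit-image; the grid resolution below and the reshape of D(p) onto a 3-D grid are assumptions:

```python
import numpy as np
from skimage import measure

nx = ny = nz = 64                           # assumed grid resolution of the cube model
D, W = fuse_tsdf(d_per_view, w_per_view)    # fused field from the sketch above
verts, faces, normals, values = measure.marching_cubes(D.reshape(nx, ny, nz),
                                                       level=0.0)
```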
Step S404: Sample spatial points from the small cubes according to the fused truncated distances of the small cubes in the cube model.
Specifically, spatial points are sampled from the voxels of the cube model according to their fused truncated distances. For the first optional solution in step S403, spatial points can be sampled from every voxel according to its fused truncated distance; for the second optional solution in step S403, spatial points can be sampled from the first small cubes (i.e., the first voxels) according to their fused truncated distances.
The sampling principle is that small cubes with smaller fused truncated distances yield more sampled spatial points. For example, for any image plane, along a camera ray emitted from that plane, the closer a voxel's fused truncated distance D(p) is to 0, the higher the sampling density and the more times the voxel is sampled; the closer D(p) is to 1 or -1, the lower the sampling density and the fewer the samples. Optionally, the sample count of cubes whose fused truncated distance is greater than or equal to a preset threshold in absolute value is zero; for example, voxels with D(p) equal to 1 or -1 may not be sampled at all.
Optionally, the sample count n of a voxel on a camera ray and that voxel's fused truncated distance D(p) satisfy:
n ∝ (1 - |D(p)|)
where D(p) is the fused truncated distance of the voxel and n is the number of times the voxel is sampled.
It should be noted that a voxel is not a single point in the cube space but contains a large number of spatial points; after the sample count of each voxel has been computed as above, a subset of its spatial points can therefore be sampled. For example, take 10 voxels, each containing 1000 spatial points, denoted voxel-1 through voxel-10, whose fused truncated distances D(p) are 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 and 1 respectively. Then the numbers of spatial points sampled from voxel-1 through voxel-10 are 90, 80, 70, 60, 50, 40, 30, 20, 10 and 0 respectively. Each sampled spatial point has a spatial coordinate (x, y, z).
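A sketch of this proportionality, reproducing the 10-voxel example above (the base count of 100 spatial points for a fully sampled voxel is inferred from that example, i.e. n = 100 · (1 - |D(p)|)):

```python
import numpy as np

def samples_per_voxel(D: np.ndarray, base: int = 100) -> np.ndarray:
    """Number of spatial points to sample per voxel: n ∝ (1 - |D(p)|)."""
    return np.round(base * (1.0 - np.abs(D))).astype(int)

D = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0])
print(samples_per_voxel(D))  # -> [90 80 70 60 50 40 30 20 10  0]
```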
In embodiments of the present application, the distances from some voxels on a camera ray to the camera cannot be detected by the relevant sensors; the truncated distances of such voxels cannot be obtained, so sampling based on the truncated distance is impossible for them. Such voxels can instead be sampled in the traditional uniform (or average) manner.
Step S405: Train the deep learning network with the spatial coordinates of the sampled spatial points and the plurality of sample images.
The plurality of sample images correspond to a plurality of orientations. The deep learning network reconstructs (or computes) the image of each of these orientations from the spatial coordinates of the sampled spatial points, until the loss (for example, the difference in RGB information) between the reconstructed image of each orientation and the original sample image of that orientation is smaller than a preset value; once every orientation satisfies this condition, training of the deep learning network ends.
The reconstruction process is illustrated with an example below.
Optionally, suppose sample image 1 was captured from orientation 1. The pixel value of pixel A in image 1 is associated with the spatial coordinates of the spatial points sampled on the camera ray corresponding to pixel A, and the same operation is applied to every other pixel of image 1. After all pixels of image 1 have been processed, an image can be rendered; this is the reconstructed image A of orientation 1. The ideal result of reconstruction is that the loss (for example, the RGB information difference) between the image A reconstructed by the deep learning network for orientation 1 and sample image 1 is smaller than the preset value, and that the sample and reconstructed images of every other orientation satisfy the same relationship. The deep learning network obtained after this training can also be called the light field prediction model.
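A highly simplified training-loop sketch under these assumptions; the model, the renderer render_fn and the ray/point inputs are placeholders, since the original does not specify a network architecture or renderer:

```python
import torch

def train_light_field_model(model, render_fn, rays, sample_points, target_rgb,
                            loss_threshold=1e-3, max_iters=10000, lr=5e-4):
    """Fit the model until the per-view RGB reconstruction loss is small enough.

    model         -- torch.nn.Module mapping sampled (x, y, z) points to color/density
    render_fn     -- composites the model's outputs along each camera ray into pixels
    rays          -- camera rays of the sample images
    sample_points -- TSDF-guided sample points along each ray
    target_rgb    -- ground-truth pixel colors from the sample images
    """
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(max_iters):
        pred_rgb = render_fn(model, rays, sample_points)  # reconstruct the views
        loss = torch.mean((pred_rgb - target_rgb) ** 2)   # RGB information difference
        if loss.item() < loss_threshold:                  # every view fits well enough
            break
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return model
```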
Step S406: Predict, through the deep learning network (light field prediction model), the image that would be captured when the above scene is shot from a new orientation.
Specifically, the new orientation can be any orientation other than the above-mentioned plurality of orientations, and can be represented by five-dimensional (5D) coordinates, for example orientation 1 = (x, y, z, θ, φ), where (x, y, z) represents the position and (θ, φ) represents the viewing direction. It can be understood that, in this way, the image of the scene captured from an arbitrary orientation can be predicted; generally speaking, once the images of the scene from arbitrary orientations can be obtained, the three-dimensional light field of the scene is considered to have been obtained.
Optionally, the predicted image can be an RGB image.
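Once trained, querying the model at a new 5D orientation can be sketched as follows; make_rays and uniform_samples are hypothetical helpers (no TSDF guidance exists for an unseen view, so uniform sampling along the new rays is assumed):

```python
import torch

def predict_view(model, render_fn, make_rays, uniform_samples, pose_5d):
    """Predict the RGB image seen from a new orientation (x, y, z, theta, phi)."""
    rays = make_rays(pose_5d)            # one camera ray per output pixel
    points = uniform_samples(rays)       # sample points along each new ray
    with torch.no_grad():                # inference only
        rgb = render_fn(model, rays, points)
    return rgb
```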
Table 1 compares the prediction performance of the prior-art trained deep learning network illustrated in Figure 1 with that of the deep learning network trained in the present application; the PSNR comparison results can be output by simulation:
Table 1

Metric         Prior art    Embodiment of this application
PSNR-Coarse    33           37
PSNR-Fine      36           37.2
PSNR-Test      35.5         36.5
In Table 1, peak signal-to-noise ratio (PSNR) is the metric used to measure prediction performance. PSNR-Coarse is the performance of coarse-grained prediction of images at the orientations of the existing sample images; PSNR-Fine is the performance of fine-grained prediction at those orientations; PSNR-Test is the performance when predicting the captured image of a new orientation. The larger the PSNR value, the better the prediction performance of the trained deep learning network.
Figure 8 presents, in image form, a comparison between the prediction performance of the deep learning network trained in the prior art and that of the deep learning network trained in the present application. In Figure 8, part (g) is a captured image of a scene, part (h) is the result of reconstructing this image with the prior-art deep learning network, and part (i) is the result of reconstructing it with the deep learning network of an embodiment of the present application. It can be seen that the embodiment of the present application expresses the surface textures of objects in richer and clearer detail and better reflects the depth layering of the scene.
In the method shown in Figure 4, the voxel sample points used to train the deep learning network (which predicts the three-dimensional light field) are obtained from the depth information of the images: a truncated distance is computed from the depth information and the voxel-to-camera distance, and sampling is differentiated according to the magnitude of the truncated distance. On the one hand, this sampling scheme quickly concentrates the samples in the key regions, improving sampling efficiency; on the other hand, the sampled voxels are concentrated near object surfaces, so the deep learning network subsequently trained on them can better characterize the texture details of objects during image prediction and can reduce blur and structural errors.
The methods of the embodiments of the present application have been described in detail above; the apparatuses of the embodiments of the present application are provided below.
Referring to Figure 9, Figure 9 is a schematic structural diagram of an apparatus 90 for generating a light field prediction model provided by an embodiment of the present application. The apparatus 90 may include an establishing unit 901, a first computing unit 902, a sampling unit 903 and a second computing unit 904, described in detail as follows.
The establishing unit 901 is configured to establish, according to the respective shooting orientations of a plurality of sample images, a cube model surrounding the captured scene, wherein the cube model includes a plurality of small cubes (voxels).
The first computing unit 902 is configured to compute, from the plurality of sample images, a plurality of truncated distances of each of the plurality of small cubes, wherein computing one truncated distance of each small cube from a first sample image includes: determining the truncated distance according to the distance from the camera to the small cube and the distance from the camera to object surfaces in the scene when the first sample image was captured, the first sample image being any one of the plurality of sample images.
The sampling unit 903 is configured to sample spatial points from the small cubes according to the plurality of truncated distances of each small cube, wherein each sampled spatial point corresponds to a spatial coordinate.
The second computing unit 904 is configured to train a light field prediction model according to the spatial coordinates of the sampled spatial points, wherein the light field prediction model is used to predict the light field of the scene.
In the above apparatus, the voxel sample points used to train the deep learning network (also called the light field prediction model, used to predict the three-dimensional light field) are obtained from the depth information of the images: a truncated distance is computed from the depth information and the voxel-to-camera distance, and sampling is differentiated according to the magnitude of the truncated distance. On the one hand, this sampling scheme quickly concentrates the samples in the key regions, improving sampling efficiency; on the other hand, the sampled voxels are concentrated near object surfaces, so the deep learning network subsequently trained on them can better characterize the texture details of objects during image prediction and can reduce blur and structural errors.
In an optional solution, the apparatus 90 further includes:
a prediction unit, configured to predict the light field of the scene through the light field prediction model. That is, after training the light field prediction model, the model training device also predicts the light field through the model.
In another optional solution, in terms of sampling spatial points from the small cubes according to the plurality of truncated distances of each small cube, the sampling unit 903 is specifically configured to: perform fusion calculation on the plurality of truncated distances of each small cube to obtain its fused truncated distance; and sample spatial points from each small cube according to its fused truncated distance. In this implementation, a fused truncated distance is computed for every small cube, and sampling is performed according to it.
In another optional solution, in terms of sampling spatial points from the small cubes according to the plurality of truncated distances of each small cube, the sampling unit 903 is specifically configured to: determine the first small cubes, i.e., those small cubes having at least one truncated distance whose absolute value is smaller than a preset threshold; perform fusion calculation on the plurality of truncated distances of each first small cube to obtain its fused truncated distance; and sample spatial points from the first small cubes according to their fused truncated distances. In this way, the fused truncated distance is not computed for all small cubes, but only for those whose truncated distance has an absolute value below the preset threshold. The larger a small cube's truncated distance, the farther it is from the object surface and the less it needs to be sampled later; the present application therefore does not perform fusion calculation on the truncated distances of such small cubes, which is equivalent to excluding them from the sampling scope in advance. This reduces the amount of computation and improves the generation efficiency of the light field prediction model while essentially preserving the subsequent sampling quality.
In another optional solution, in the process of sampling spatial points from the small cubes, small cubes with smaller fused truncated distances yield more sampled spatial points. It can be understood that a cube with a smaller fused truncated distance is closer to the object surface, and the spatial points in such a small cube reflect the pixel information better than the spatial points in other small cubes on the same camera ray. Training the light field prediction model more heavily on spatial points from such small cubes therefore helps the model predict more accurate images later.
In another optional solution, the fusion calculation includes a weighted average calculation. It can be understood that a fused truncated distance obtained by weighted averaging reflects the distance from a small cube to the object surface more accurately.
In another optional solution, the weight, in the weighted average calculation, of the truncated distance of the second small cube computed from the first sample image is negatively correlated with the distance from the second small cube to the camera when the first sample image was captured, and/or positively correlated with the cosine of the first included angle, the first included angle being the angle between the camera ray on which the second small cube lies and the normal vector of the object surface closest to the second small cube, the second small cube being any small cube in the cube model.
In this way, the weight values computed from each sample image are used in computing the fused truncated distance. Because a weight value is negatively correlated with the camera distance at capture time and/or positively correlated with the cosine of the first included angle, incorporating it into the fusion reflects the influence of the different orientations on the fused truncated distance more accurately.
In another optional solution, the apparatus 90 further includes:
an acquisition unit, configured to collect, when each of the plurality of sample images is captured, depth information from the shooting viewpoint of that sample image, wherein the depth information characterizes the distance from the camera to object surfaces in the scene.
In this way, collecting depth information from the shooting viewpoint of each sample image reflects the camera-to-surface distance more accurately, which helps compute a more accurate truncated distance.
In another optional solution, in the process of computing the fused truncated distance of the second small cube, the weight of the truncated distance computed from the first sample image in the weighted average calculation, which can also be called the weight value w(p) of the second small cube computed from the first sample image, satisfies:
w(p) = cos(θ)/distance(v)
where θ is the first included angle and distance(v) is the distance from the second small cube to the camera when the first sample image was captured.
As mentioned above, the weight value of the second small cube computed from the first sample image is negatively correlated with the camera distance at capture time and/or positively correlated with the cosine of the first included angle; this expression for w(p) is one optional realization of that idea, so incorporating this weight into the fusion reflects the influence of the different orientations on the fused truncated distance more accurately.
In another optional solution, the truncated distance d(p) of the second small cube computed from the first sample image satisfies:
d(p) = sdf(p)/|u|
where sdf(p) is the difference between the distance from the camera to the object surface in the scene and the distance from the camera to the second small cube when the first sample image was captured, and u is a preset threshold.
It can be understood that this d(p) is only one optional formula for the truncated distance; other expressions are possible in practical applications.
In another optional solution, if sdf(p) > |u|, then d(p) = 1; if sdf(p) < 0 and |sdf(p)| > |u|, then d(p) = -1.
In this way, the truncated distance of small cubes in one range is assigned the value 1 and that of small cubes in another range is assigned the value -1, which facilitates treating these two classes of small cubes identically later and thereby improves computational efficiency.
It should be noted that, for the implementation of each unit, reference may also be made to the corresponding description of the method embodiment shown in Figure 4.
Referring to Figure 10, Figure 10 shows a device 100 for generating a light field prediction model provided by an embodiment of the present application. The device 100 includes a processor 1001, a memory 1002 and a communication interface 1003, which are interconnected through a bus.
The memory 1002 includes, but is not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM) or compact disc read-only memory (CD-ROM), and is used for related instructions and data. The communication interface 1003 is used to receive and send data.
The processor 1001 may be one or more central processing units (CPUs); when the processor 1001 is one CPU, it may be a single-core or a multi-core CPU.
Since the device 100 needs a plurality of sample images when training the light field prediction model, it must obtain them, either by receiving sample images sent by other devices through the communication interface 1003, or through a camera (also called an image sensor or photographing apparatus) configured on the device 100 that can capture the plurality of sample images. Optionally, when the device 100 is configured with a camera, it may also be configured with a depth sensor for collecting the depth of the objects in the captured scene; the type of the depth sensor is not limited here.
The processor 1001 in the device 100 is configured to read the program code stored in the memory 1002 and perform the following operations: establish, according to the respective shooting orientations of a plurality of sample images, a cube model surrounding the captured scene, wherein the cube model includes a plurality of small cubes (voxels); compute, from the plurality of sample images, a plurality of truncated distances of each of the plurality of small cubes, wherein computing one truncated distance of each small cube from a first sample image includes determining the truncated distance according to the distance from the camera to the small cube and the distance from the camera to object surfaces in the scene when the first sample image was captured, the first sample image being any one of the plurality of sample images; sample spatial points from the small cubes according to the plurality of truncated distances of each small cube, wherein each sampled spatial point corresponds to a spatial coordinate; and train a light field prediction model according to the spatial coordinates of the sampled spatial points, wherein the light field prediction model is used to predict the light field of the scene.
In the above method, the voxel sample points used to train the deep learning network (also called the light field prediction model, used to predict the three-dimensional light field) are obtained from the depth information of the images: a truncated distance is computed from the depth information and the voxel-to-camera distance, and sampling is differentiated according to the magnitude of the truncated distance. On the one hand, this sampling scheme quickly concentrates the samples in the key regions, improving sampling efficiency; on the other hand, the sampled voxels are concentrated near object surfaces, so the deep learning network subsequently trained on them can better characterize the texture details of objects during image prediction and can reduce blur and structural errors.
In an optional solution, after training the light field prediction model according to the spatial coordinates of the sampled spatial points, the processor 1001 is specifically configured to: predict the light field of the scene through the light field prediction model. That is, after training the light field prediction model, the model training device also predicts the light field through the model.
In another optional solution, in terms of sampling spatial points from the small cubes according to the plurality of truncated distances of each small cube, the processor is specifically configured to: perform fusion calculation on the plurality of truncated distances of each small cube to obtain its fused truncated distance; and sample spatial points from each small cube according to its fused truncated distance. In this implementation, a fused truncated distance is computed for every small cube, and sampling is performed according to it.
In another optional solution, in terms of sampling spatial points from the small cubes according to the plurality of truncated distances of each small cube, the processor is specifically configured to: determine the first small cubes, i.e., those small cubes having at least one truncated distance whose absolute value is smaller than a preset threshold; perform fusion calculation on the plurality of truncated distances of each first small cube to obtain its fused truncated distance; and sample spatial points from the first small cubes according to their fused truncated distances. In this way, the fused truncated distance is not computed for all small cubes, but only for those whose truncated distance has an absolute value below the preset threshold. The larger a small cube's truncated distance, the farther it is from the object surface and the less it needs to be sampled later; the present application therefore does not perform fusion calculation on the truncated distances of such small cubes, which is equivalent to excluding them from the sampling scope in advance. This reduces the amount of computation and improves the generation efficiency of the light field prediction model while essentially preserving the subsequent sampling quality.
In another optional solution, in the process of sampling spatial points from the small cubes, small cubes with smaller fused truncated distances yield more sampled spatial points. It can be understood that a cube with a smaller fused truncated distance is closer to the object surface, and the spatial points in such a small cube reflect the pixel information better than the spatial points in other small cubes on the same camera ray. Training the light field prediction model more heavily on spatial points from such small cubes therefore helps the model predict more accurate images later.
In another optional solution, the fusion calculation includes a weighted average calculation. It can be understood that a fused truncated distance obtained by weighted averaging reflects the distance from a small cube to the object surface more accurately.
In another optional solution, the weight, in the weighted average calculation, of the truncated distance of the second small cube computed from the first sample image is negatively correlated with the distance from the second small cube to the camera when the first sample image was captured, and/or positively correlated with the cosine of the first included angle, the first included angle being the angle between the camera ray on which the second small cube lies and the normal vector of the object surface closest to the second small cube, the second small cube being any small cube in the cube model.
In this way, the weight values computed from each sample image are used in computing the fused truncated distance. Because a weight value is negatively correlated with the camera distance at capture time and/or positively correlated with the cosine of the first included angle, incorporating it into the fusion reflects the influence of the different orientations on the fused truncated distance more accurately.
In another optional solution, before the truncated distance of each small cube in the cube model is computed from the plurality of sample images, the processor is further configured to: collect, when each of the plurality of sample images is captured, depth information from the shooting viewpoint of that sample image, wherein the depth information characterizes the distance from the camera to object surfaces in the scene.
In this way, collecting depth information from the shooting viewpoint of each sample image reflects the camera-to-surface distance more accurately, which helps compute a more accurate truncated distance.
In another optional solution, in the process of computing the fused truncated distance of the second small cube, the weight of the truncated distance computed from the first sample image in the weighted average calculation, which can also be called the weight value w(p) of the second small cube computed from the first sample image, satisfies:
w(p) = cos(θ)/distance(v)
where θ is the first included angle and distance(v) is the distance from the second small cube to the camera when the first sample image was captured.
As mentioned above, the weight value of the second small cube computed from the first sample image is negatively correlated with the camera distance at capture time and/or positively correlated with the cosine of the first included angle; this expression for w(p) is one optional realization of that idea, so incorporating this weight into the fusion reflects the influence of the different orientations on the fused truncated distance more accurately.
In another optional solution, the truncated distance d(p) of the second small cube computed from the first sample image satisfies:
d(p) = sdf(p)/|u|
where sdf(p) is the difference between the distance from the camera to the object surface in the scene and the distance from the camera to the second small cube when the first sample image was captured, and u is a preset threshold.
It can be understood that this d(p) is only one optional formula for the truncated distance; other expressions are possible in practical applications.
With reference to the first aspect or any of the above possible implementations of the first aspect, in yet another optional solution of the first aspect, if sdf(p) > |u|, then d(p) = 1; if sdf(p) < 0 and |sdf(p)| > |u|, then d(p) = -1.
In this way, the truncated distance of small cubes in one range is assigned the value 1 and that of small cubes in another range is assigned the value -1, which facilitates treating these two classes of small cubes identically later and thereby improves computational efficiency.
It should be noted that, for the implementation of each operation, reference may also be made to the corresponding description of the method embodiment shown in Figure 4.
An embodiment of the present application further provides a chip system. The chip system includes at least one processor, a memory and an interface circuit, which are interconnected through lines; instructions are stored in the at least one memory, and when the instructions are executed by the processor, the method flow shown in Figure 4 is implemented.
An embodiment of the present application further provides a computer-readable storage medium. The computer-readable storage medium stores instructions which, when run on a processor, implement the method flow shown in Figure 4.
An embodiment of the present application further provides a computer program product which, when run on a processor, implements the method flow shown in Figure 4.
In summary, by implementing the embodiments of the present application, the voxel sample points used to train the deep learning network (also called the light field prediction model, used to predict the three-dimensional light field) are obtained from the depth information of the images: a truncated distance is computed from the depth information and the voxel-to-camera distance, and sampling is differentiated according to the magnitude of the truncated distance. On the one hand, this sampling scheme quickly concentrates the samples in the key regions, improving sampling efficiency; on the other hand, the sampled voxels are concentrated near object surfaces, so the deep learning network subsequently trained on them can better characterize the texture details of objects during image prediction and can reduce blur and structural errors.
A person of ordinary skill in the art can understand that all or part of the processes of the methods in the above embodiments can be implemented by a computer program instructing relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The aforementioned storage medium includes media that can store program code, such as ROM, random access memory (RAM), magnetic disks or optical discs.

Claims (16)

  1. A method for generating a light field prediction model, characterized by comprising:
    establishing, according to the respective shooting orientations of a plurality of sample images, a cube model surrounding a captured scene, wherein the cube model comprises a plurality of small cubes (voxels);
    computing, from the plurality of sample images, a plurality of truncated distances of each small cube of the plurality of small cubes, wherein computing one truncated distance of each small cube from a first sample image comprises: determining the truncated distance according to the distance from a camera to the small cube and the distance from the camera to an object surface in the scene when the first sample image was captured, the first sample image being any one of the plurality of sample images;
    sampling spatial points from the small cubes according to the plurality of truncated distances of each small cube, wherein each sampled spatial point corresponds to a spatial coordinate; and
    training a light field prediction model according to the spatial coordinates of the sampled spatial points, wherein the light field prediction model is used to predict a light field of the scene.
  2. The method according to claim 1, characterized in that, after the training of the light field prediction model according to the spatial coordinates of the sampled spatial points, the method further comprises:
    predicting the light field of the scene through the light field prediction model.
  3. The method according to claim 1 or 2, characterized in that the sampling of spatial points from the small cubes according to the plurality of truncated distances of each small cube comprises:
    performing fusion calculation on the plurality of truncated distances of each small cube to obtain a fused truncated distance of each small cube; and
    sampling spatial points from each small cube according to the fused truncated distance of that small cube.
  4. The method according to claim 1 or 2, characterized in that the sampling of spatial points from the small cubes according to the plurality of truncated distances of each small cube comprises:
    determining, among the small cubes, the first small cubes having at least one truncated distance whose absolute value is smaller than a preset threshold;
    performing fusion calculation on the plurality of truncated distances of the first small cube to obtain a fused truncated distance of the first small cube; and
    sampling spatial points from the first small cube according to the fused truncated distance of the first small cube.
  5. The method according to claim 3 or 4, characterized in that, in the process of sampling spatial points from the small cubes, more spatial points are sampled from small cubes with smaller fused truncated distances.
  6. The method according to any one of claims 3-5, characterized in that the fusion calculation comprises a weighted average calculation.
  7. The method according to claim 6, characterized in that the weight, in the weighted average calculation, of the truncated distance of a second small cube computed from the first sample image is negatively correlated with the distance from the second small cube to the camera when the first sample image was captured, and/or positively correlated with the cosine of a first included angle, the first included angle being the angle between the camera ray on which the second small cube lies and the normal vector of the object surface closest to the second small cube, the second small cube being any small cube in the cube model.
  8. An apparatus for generating a light field prediction model, characterized by comprising:
    an establishing unit, configured to establish, according to the respective shooting orientations of a plurality of sample images, a cube model surrounding a captured scene, wherein the cube model comprises a plurality of small cubes (voxels);
    a first computing unit, configured to compute, from the plurality of sample images, a plurality of truncated distances of each small cube of the plurality of small cubes, wherein computing one truncated distance of each small cube from a first sample image comprises: determining the truncated distance according to the distance from a camera to the small cube and the distance from the camera to an object surface in the scene when the first sample image was captured, the first sample image being any one of the plurality of sample images;
    a sampling unit, configured to sample spatial points from the small cubes according to the plurality of truncated distances of each small cube, wherein each sampled spatial point corresponds to a spatial coordinate; and
    a second computing unit, configured to train a light field prediction model according to the spatial coordinates of the sampled spatial points, wherein the light field prediction model is used to predict a light field of the scene.
  9. The apparatus according to claim 8, characterized in that the apparatus further comprises:
    a prediction unit, configured to predict the light field of the scene through the light field prediction model.
  10. The apparatus according to claim 8 or 9, characterized in that, in terms of sampling spatial points from the small cubes according to the plurality of truncated distances of each small cube, the sampling unit is specifically configured to:
    perform fusion calculation on the plurality of truncated distances of each small cube to obtain a fused truncated distance of each small cube; and
    sample spatial points from each small cube according to the fused truncated distance of that small cube.
  11. The apparatus according to claim 8 or 9, wherein, in terms of sampling spatial points from the small cubes according to the plurality of truncation distances of each small cube, the sampling unit is specifically configured to:
    determine, among the small cubes, first small cubes each having at least one truncation distance whose absolute value is less than a preset threshold;
    perform a fusion calculation on the plurality of truncation distances of each first small cube to obtain a fused truncation distance of that first small cube; and
    sample spatial points from each first small cube according to its fused truncation distance.
  12. The apparatus according to claim 10 or 11, wherein, in the process of sampling spatial points from the small cubes, more spatial points are sampled from a small cube whose fused truncation distance is smaller (an illustrative sampling sketch follows the claims).
  13. The apparatus according to any one of claims 10 to 12, wherein the fusion calculation comprises a weighted average calculation.
  14. The apparatus according to claim 13, wherein the weight value, in the weighted average calculation, of the truncation distance of a second small cube calculated based on the first sample image is negatively correlated with the distance from the second small cube to the camera when the first sample image was captured, and/or is positively correlated with a first included angle, where the first included angle is the angle between the camera ray on which the second small cube lies and the normal vector of the object surface closest to the second small cube, and the second small cube is any small cube in the cube model.
  15. A device for generating a light field prediction model, comprising a processor and a memory, wherein the memory is configured to store a computer program which, when run on the processor, implements the method according to any one of claims 1 to 7.
  16. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program which, when run on a processor, implements the method according to any one of claims 1 to 7.
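The following sketches are editorial illustrations in Python of the techniques recited in the claims; they are not the patented implementation. First, the per-image truncation distance computed by the first computing unit of claim 8: the signed-difference formulation and the truncation band `trunc` are assumptions, since the claims only require that the distance be determined from the camera-to-voxel and camera-to-surface distances.

```python
import numpy as np

def truncation_distance(d_voxel: float, d_surface: float, trunc: float = 0.05) -> float:
    """One truncation distance of a voxel for one sample image.

    d_voxel   -- distance from the camera to the voxel when the image was taken
    d_surface -- distance from the camera to the object surface along the same ray
    trunc     -- assumed truncation band (not specified in the claims)
    """
    # Signed difference: positive in front of the surface, negative behind it,
    # clamped so that only voxels near a surface carry useful information.
    return float(np.clip(d_surface - d_voxel, -trunc, trunc))
```

Clamping to a narrow band is what lets the later claims skip empty space: a voxel whose every truncation distance sits at the clamp limit is far from any surface and need not be sampled densely.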
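The weighted-average fusion of claims 6-7 and 13-14 might then look as follows. The concrete weight function `angle / distance` is an assumption, chosen only to satisfy the claimed correlations: the weight falls with the camera-to-voxel distance and grows with the first included angle.

```python
import numpy as np

def observation_weight(voxel_center, cam_pos, surface_normal):
    """Assumed weight of one per-image truncation distance of a voxel."""
    ray = np.asarray(voxel_center, dtype=float) - np.asarray(cam_pos, dtype=float)
    distance = np.linalg.norm(ray)                   # camera-to-voxel distance
    n = np.asarray(surface_normal, dtype=float)
    cos_a = np.dot(ray / distance, n / np.linalg.norm(n))
    angle = np.arccos(np.clip(cos_a, -1.0, 1.0))     # the "first included angle"
    return angle / max(distance, 1e-6)               # grows with angle, falls with distance

def fused_truncation_distance(distances, weights):
    """Weighted average of a voxel's per-image truncation distances (claims 6/13)."""
    w = np.asarray(weights, dtype=float)
    return float(np.dot(w, np.asarray(distances, dtype=float)) / w.sum())
```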
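Claims 10-12 together describe selecting near-surface voxels, fusing their truncation distances, and concentrating samples where the fused distance is small. In the sketch below, the preset threshold value, the inverse-magnitude allocation rule, and uniform sampling inside each voxel are all illustrative choices; the claims fix only the ordering (smaller fused distance, more samples).

```python
import numpy as np

def sample_spatial_points(voxels, threshold=0.04, total=100_000, rng=None):
    """voxels: list of dicts with 'center' (3,), 'size' (edge length),
    'tsdf' (per-image truncation distances) and 'w' (their fusion weights)."""
    rng = rng or np.random.default_rng(0)
    # Claim 11: keep voxels having at least one |truncation distance| < threshold.
    kept = [v for v in voxels if np.min(np.abs(v["tsdf"])) < threshold]
    if not kept:
        return np.empty((0, 3))
    # Claims 10/13: weighted-average fusion of each kept voxel's distances.
    fused = np.array([np.dot(v["w"], v["tsdf"]) / np.sum(v["w"]) for v in kept])
    # Claim 12: more samples where the fused truncation distance is small
    # (inverse-magnitude weighting is an assumed concrete rule).
    scores = 1.0 / (np.abs(fused) + 1e-4)
    counts = np.floor(total * scores / scores.sum()).astype(int)
    points = [np.asarray(v["center"]) + (rng.random((n, 3)) - 0.5) * v["size"]
              for v, n in zip(kept, counts) if n > 0]
    return np.concatenate(points) if points else np.empty((0, 3))
```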
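Finally, the prediction unit of claim 9 is architecture-agnostic. Purely for illustration, a NeRF-style query interface (spatial coordinate plus view direction in, color plus density out) is assumed below; the claims do not prescribe the model's inputs, outputs, or structure.

```python
import numpy as np

def predict_light_field(model, points, view_dirs):
    """Query a trained light field prediction model at sampled points.

    model     -- assumed callable mapping an (N, 6) array of
                 (x, y, z, dx, dy, dz) to an (N, 4) array of (r, g, b, density)
    points    -- (N, 3) spatial coordinates of sampled points
    view_dirs -- (N, 3) unit viewing directions
    """
    out = np.asarray(model(np.concatenate([points, view_dirs], axis=-1)))
    return out[..., :3], out[..., 3]     # per-point color and volume density
```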
PCT/CN2021/080893 2021-03-15 2021-03-15 Method for generating light field prediction model, and related apparatus WO2022193104A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202180095331.6A CN117015966A (en) 2021-03-15 2021-03-15 Method and related device for generating light field prediction model
PCT/CN2021/080893 WO2022193104A1 (en) 2021-03-15 2021-03-15 Method for generating light field prediction model, and related apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/080893 WO2022193104A1 (en) 2021-03-15 2021-03-15 Method for generating light field prediction model, and related apparatus

Publications (1)

Publication Number Publication Date
WO2022193104A1 (en) 2022-09-22

Family

ID=83321601

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/080893 WO2022193104A1 (en) 2021-03-15 2021-03-15 Method for generating light field prediction model, and related apparatus

Country Status (2)

Country Link
CN (1) CN117015966A (en)
WO (1) WO2022193104A1 (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101982741A (en) * 2010-09-08 2011-03-02 北京航空航天大学 Underwater light field sampling and simulating method
CN106165387A (en) * 2013-11-22 2016-11-23 维迪诺蒂有限公司 Light field processing method
CN107018293A (en) * 2015-09-17 2017-08-04 汤姆逊许可公司 The method and apparatus that generation represents the data of light field
US20190174115A1 (en) * 2015-09-17 2019-06-06 Thomson Licensing Light field data representation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
TANG SHENGZE, GUI TAO: "Model of 5 Light Field for Navigation in Occluded Environment", Journal of Tsinghua University (Science and Technology), vol. 38, no. S1, 31 December 1998 (1998-12-31), pages 10-14, XP055968315, DOI: 10.16511/j.cnki.qhdxxb.1998.s1.003 *

Also Published As

Publication number Publication date
CN117015966A (en) 2023-11-07

Similar Documents

Publication Publication Date Title
CN112894832B (en) Three-dimensional modeling method, three-dimensional modeling device, electronic equipment and storage medium
CN108986161B (en) Three-dimensional space coordinate estimation method, device, terminal and storage medium
CN106780590B (en) Method and system for acquiring depth map
US11557083B2 (en) Photography-based 3D modeling system and method, and automatic 3D modeling apparatus and method
CN110910437B (en) Depth prediction method for complex indoor scene
CN112862874B (en) Point cloud data matching method and device, electronic equipment and computer storage medium
CN112288853B (en) Three-dimensional reconstruction method, three-dimensional reconstruction device, and storage medium
WO2019238114A1 (en) Three-dimensional dynamic model reconstruction method, apparatus and device, and storage medium
WO2021093679A1 (en) Visual positioning method and device
US20170038212A1 (en) Automatic connection of images using visual features
WO2023015409A1 (en) Object pose detection method and apparatus, computer device, and storage medium
WO2022247548A1 (en) Positioning method, apparatus, electronic device, and storage medium
CN113129352A (en) Sparse light field reconstruction method and device
WO2021151380A1 (en) Method for rendering virtual object based on illumination estimation, method for training neural network, and related products
CN113902802A (en) Visual positioning method and related device, electronic equipment and storage medium
CN116071504B (en) Multi-view three-dimensional reconstruction method for high-resolution image
CN115578515B (en) Training method of three-dimensional reconstruction model, three-dimensional scene rendering method and device
WO2022193104A1 (en) Method for generating light field prediction model, and related apparatus
Chen et al. Densefusion: Large-scale online dense pointcloud and dsm mapping for uavs
Lyu et al. 3DOPFormer: 3D Occupancy Perception from Multi-Camera Images with Directional and Distance Enhancement
Mo et al. Cross-based dense depth estimation by fusing stereo vision with measured sparse depth
CN112288817A (en) Three-dimensional reconstruction processing method and device based on image
Kolhatkar et al. Real-time virtual viewpoint generation on the GPU for scene navigation
Hou et al. Depth estimation and object detection for monocular semantic SLAM using deep convolutional network
Liao et al. VI-NeRF-SLAM: a real-time visual–inertial SLAM with NeRF mapping

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21930700

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 202180095331.6

Country of ref document: CN

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21930700

Country of ref document: EP

Kind code of ref document: A1