WO2024031251A1 - Volume rendering method and system for embedding 2D/3D video in NeRF three-dimensional scene reconstruction - Google Patents

Volume rendering method and system for embedding 2D/3D video in NeRF three-dimensional scene reconstruction

Info

Publication number
WO2024031251A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
nerf
voxel
dimensional
embedding
Prior art date
Application number
PCT/CN2022/110907
Other languages
English (en)
French (fr)
Inventor
张岩
Original Assignee
北京原创力科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京原创力科技有限公司 filed Critical 北京原创力科技有限公司
Priority to PCT/CN2022/110907 priority Critical patent/WO2024031251A1/zh
Publication of WO2024031251A1 publication Critical patent/WO2024031251A1/zh

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 15/00 - 3D [Three Dimensional] image rendering
    • G06T 15/08 - Volume rendering

Definitions

  • the present invention relates to the technical fields of computer graphics and three-dimensional reconstruction, and in particular to a volume rendering method and system for embedding 2D/3D video in NeRF three-dimensional scene reconstruction.
  • NeRF (neural radiance field) is an implicit three-dimensional scene representation in which a complex static or dynamic (time-dependent) scene is modeled by a neural network.
  • the trained NeRF model can calculate the voxel density value at any coordinate and any time in three-dimensional space (for dynamic scenes), as well as the color value along a given ray direction. Scenes and videos reconstructed using NeRF support free-viewpoint viewing, giving users a more immersive experience.
  • although the NeRF method can reconstruct three-dimensional scenes well, it requires demanding shooting conditions.
  • current volumetric video production requires expensive camera-array shooting equipment and a large amount of post-production time.
  • the NeRF-based volumetric video production method can reduce the number of cameras in the array and shorten post-production time, but volumetric video remains difficult to promote on a large scale because of its high acquisition and production costs.
  • the present invention combines the scene representation of NeRF to make full use of existing 2D/3D video resources and enrich the volumetric video material library. It solves the existing problems of long production cycles and high costs of volumetric video. Moreover, the present invention can insert rich 2D/3D video resources at arbitrary positions, solving the problems that existing volumetric video content is not rich enough and lacks artistic expressiveness.
  • the present invention also proposes a volume rendering method for embedding 2D/3D video in NeRF three-dimensional scene reconstruction, which includes:
  • Step 1 Obtain the viewing angle parameters and the embedding position of the 2D/3D video, input them into the trained NeRF offline model, and obtain the NeRF three-dimensional space scene;
  • Step 2 Perform image frame voxelization on the 2D or 3D video stream to be embedded and then embed it into the NeRF three-dimensional space scene to obtain the video embedded three-dimensional space scene;
  • Step 3 Perform joint volume rendering on the video-embedded three-dimensional space scene to obtain the three-dimensional video with the 2D or 3D video stream embedded under the viewing angle parameters.
  • step 1 includes obtaining the viewing angle parameters from a head-mounted VR display, or obtaining the binocular camera parameters of the viewpoint through real-time human-eye recognition and positioning and using them as the viewing angle parameters.
  • in the volume rendering method of embedding 2D/3D video in NeRF three-dimensional scene reconstruction, the voxelization of the picture frame in step 2 includes:
  • the picture frames of the video stream are voxelized according to the video embedding position and the resolution it covers.
  • the information saved in each voxel includes the RGB three-channel color value and the voxel opacity.
  • the color value follows the original picture frame, and the voxel opacity is the probability that light is absorbed after passing through the voxel.
  • step 3 includes: determining the ray sampling area according to the viewing angle parameters; integrating the color and opacity of the voxels traversed along the ray direction until the ray is absorbed, the integrated value being the sampled color value used as the rendering result of the current frame; if a voxel along the ray coincides with a voxel of the video stream, using the color value and opacity value of the video-stream voxel; and assembling the rendering results of all frames into the three-dimensional video.
  • the present invention also proposes a volume rendering system that embeds 2D/3D video in NeRF three-dimensional scene reconstruction, which includes:
  • the initial module is used to obtain the viewing angle parameters and the embedding position of the 2D/3D video, and input them into the trained NeRF offline model to obtain the NeRF three-dimensional space scene;
  • the embedding module is used to voxelize the picture frame of the 2D or 3D video stream to be embedded and then embed it into the NeRF three-dimensional space scene to obtain the video embedded three-dimensional space scene;
  • the rendering module is used for joint volume rendering of the video-embedded three-dimensional space scene to obtain the three-dimensional video with the 2D or 3D video stream embedded under the viewing angle parameters.
  • in the described volume rendering system for embedding 2D/3D video in NeRF three-dimensional scene reconstruction, the initial module is used to obtain the viewing angle parameters from a head-mounted VR display, or to obtain the binocular camera parameters of the viewpoint through real-time human-eye recognition and positioning and use them as the viewing angle parameters.
  • in the volume rendering system for embedding 2D/3D video in NeRF three-dimensional scene reconstruction, the picture frame voxelization includes:
  • the picture frames of the video stream are voxelized according to the video embedding position and the resolution it covers.
  • the information saved in each voxel includes the RGB three-channel color value and the voxel opacity.
  • the color value follows the original picture frame, and the voxel opacity is the probability that light is absorbed after passing through the voxel.
  • in the volume rendering system for embedding 2D/3D video in NeRF three-dimensional scene reconstruction, the rendering module includes: determining the ray sampling area according to the viewing angle parameters; integrating the color and opacity of the voxels traversed along the ray direction until the ray is absorbed, the integrated value being the sampled color value used as the rendering result of the current frame; if a voxel along the ray coincides with a voxel of the video stream, using the color value and opacity value of the video-stream voxel; and assembling the rendering results of all frames into the three-dimensional video.
  • the present invention also proposes a storage medium for storing a program for executing any one of the volume rendering methods of embedding 2D/3D video in NeRF three-dimensional scene reconstruction.
  • the present invention also proposes a client for any volume rendering system that embeds 2D/3D video in NeRF three-dimensional scene reconstruction.
  • the advantages of the present invention are: compared with existing volumetric video production tools, the present invention greatly shortens the production cycle of volumetric videos and reduces production costs; at the same time, it increases the editability of volumetric videos.
  • Figure 1 is a block diagram of the NeRF three-dimensional scene fusion 2D/3D video technology of the present invention
  • Figure 2 is a schematic diagram of light sampling during the rendering process of the present invention.
  • the present invention uses the volume rendering principle of NeRF to propose a rendering pipeline: embedding 2D/3D video into a designated area in a three-dimensional scene during the volume rendering process to achieve the purpose of integrating 2D/3D and NeRF three-dimensional scenes.
  • the volume rendering of the present invention considers only voxel absorption: the voxels are treated as cold, black particles that absorb, with a certain probability, all light that hits them; the voxels neither emit nor scatter light.
  • the present invention includes the following key technical points: using 2D/3D video to enrich NeRF reconstructed three-dimensional scenes; and using the principle of volume rendering to jointly render 2D/3D video and NeRF models.
  • the current volume video production process has problems such as high shooting costs and long production cycles.
  • the volume video production method based on NeRF 3D reconstruction can effectively reduce shooting costs and post-production time. Limited by computing power and memory overhead, the current NeRF reconstruction method can only reconstruct a limited range of scenes, and the richness of the reconstructed volume video is affected.
  • the present invention proposes a joint volume rendering technology to embed 2D/3D video into the three-dimensional scene reconstructed by NeRF, obtaining volumetric video with richer content and greater immersion.
  • the overall technical framework of the present invention is shown in Figure 1. Because the NeRF model represents the three-dimensional scene implicitly within the neural network, the camera parameters of the viewing angle and the embedding position of the 2D/3D video are first input into the trained NeRF offline model to determine the three-dimensional scene area to be rendered.
  • embedding the 2D/3D video stream is equivalent to explicitly inserting the video stream into the NeRF three-dimensional scene (comparable to placing a display screen in the space).
  • the 2D or 3D video stream is embedded into the corresponding NeRF three-dimensional space scene, and the fused binocular RGB image is finally output.
  • the specific implementation details of each module are introduced below:
  • Step S1 viewing viewpoint camera parameters: Different viewing devices obtain viewpoint camera parameters in different ways.
  • for a VR/AR head-mounted display, the binocular camera parameters of the viewing viewpoint can be obtained directly;
  • for 3D light-field displays and holographic projection, the binocular camera parameters of the viewpoint can be obtained through real-time human-eye recognition and positioning technology.
  • Camera parameters include external parameter matrices and internal parameter matrices. Three-dimensional space points can be mapped to image space through external parameters and internal parameters.
  • Camera external parameter matrix includes rotation matrix and translation matrix. The rotation matrix and translation matrix jointly describe how to convert points from the world coordinate system to the camera coordinate system.
  • the camera intrinsic parameter matrix is used to convert the image coordinate system into a pixel coordinate system. In subsequent rendering, after the viewing angle is determined, the camera parameters are used to map the three-dimensional space points in the viewing angle direction to the image space to generate the corresponding two-dimensional RGB image.
  • Step S2 Embedding position: Based on the size of the NeRF reconstructed scene and the resolution of the 2D/3D video, an automated position recommendation algorithm is used to recommend the most suitable video embedding position, while supporting manual adjustment.
  • Step S3 NeRF offline model: collect video through a multi-channel camera array, and train the NeRF light field model to save the information of the volume video.
  • Step S4 2D/3D video stream: it can be existing video material, or it can be a video stream collected in real time.
  • Step S5 voxelize the picture frame: voxelize the picture frame according to the video embedding position and resolution determined in step S2.
  • the thickness of the voxelization of the picture frame can be determined according to the requirements of the presentation effect.
  • the information saved by each voxel includes the RGB three-channel color value (0 to 255) and an opacity value (0 to 1).
  • the color value follows the original picture frame.
  • the opacity can be set freely.
  • the voxel opacity is the probability that light is absorbed after passing through the voxel.
  • Step S6 NeRF offline model rendering and 2D/3D video stream rendering are integrated, that is, joint rendering.
  • the image rendering process for a certain viewpoint is divided into the following steps:
  • Step S61 Confirm the sampling area of the light through the parameters determined in step S1;
  • Step S62 As shown in Figure 2, the traveling direction of the ray is obtained from the camera parameters, and the colors of the voxels traversed are integrated along this direction until the ray is absorbed. The integrated value is the color value of this sample, and the color and opacity of the voxels are calculated by the NeRF model;
  • Step S63 If a voxel along the ray coincides with a voxel predefined in step S5, the color value and opacity value of the step S5 voxel are selected;
  • Step S64 The RGB color value of each pixel is sampled 100 times and averaged (to reduce statistical error).
  • the present invention proposes a volume rendering method and system for embedding 2D/3D video in NeRF three-dimensional scene reconstruction.
  • the camera parameters of the viewing angle and the embedding position of the 2D/3D video are input into the trained NeRF offline model to determine the three-dimensional scene area to be rendered.
  • the 2D/3D video stream is explicitly inserted into the NeRF three-dimensional scene; the 2D or 3D video stream is embedded into the corresponding NeRF three-dimensional space scene through volume rendering, and the fused binocular RGB image is finally output.
  • the present invention makes full use of existing 2D/3D video resources and improves the richness of the volume video material library.
  • the existing volumetric video production cycle is shortened, the production cost is reduced, and the editability of the volumetric video is increased.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Graphics (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Generation (AREA)

Abstract

The present invention proposes a volume rendering method and system for embedding 2D/3D video in NeRF three-dimensional scene reconstruction, comprising: obtaining viewing angle parameters and an embedding position of the 2D/3D video, inputting them into a trained NeRF offline model, and obtaining a NeRF three-dimensional space scene; voxelizing the picture frames of the 2D or 3D video stream to be embedded and then embedding it into the NeRF three-dimensional space scene to obtain a video-embedded three-dimensional space scene; and performing joint volume rendering on the video-embedded three-dimensional space scene to obtain a three-dimensional video with the 2D or 3D video stream embedded under the viewing angle parameters. The present invention makes full use of existing 2D/3D video resources and enriches the volumetric video material library, shortening the existing volumetric video production cycle and reducing production costs.

Description

Volume rendering method and system for embedding 2D/3D video in NeRF three-dimensional scene reconstruction
Technical Field
The present invention relates to the technical fields of computer graphics and three-dimensional reconstruction, and in particular to a volume rendering method and system for embedding 2D/3D video in NeRF three-dimensional scene reconstruction.
Background Art
A neural radiance field (NeRF) is an implicit three-dimensional scene representation in which a complex static scene is modeled by a single neural network; both static and dynamic (time-dependent) scenes can be modeled. A trained NeRF model can compute the voxel density value at any coordinate and any time (for dynamic scenes) in three-dimensional space, as well as the color value along a given ray direction. Scenes and videos reconstructed with NeRF support free-viewpoint viewing, giving users a more immersive experience.
Although the NeRF method can reconstruct three-dimensional scenes well, it requires demanding shooting conditions. For example, current volumetric video production requires expensive camera-array shooting equipment and a large amount of post-production time. NeRF-based volumetric video production can reduce the number of cameras in the array and shorten post-production time, but volumetric video remains difficult to promote on a large scale because of its high acquisition and production costs.
Disclosure of the Invention
In view of the shortcomings of the prior art, the present invention combines the scene representation of NeRF to make full use of existing 2D/3D video resources and enrich the volumetric video material library, solving the existing problems of long production cycles and high costs of volumetric video. Moreover, the present invention can insert rich 2D/3D video resources at arbitrary positions, solving the problems that existing volumetric video content is not rich enough and lacks artistic expressiveness.
Specifically, the present invention proposes a volume rendering method for embedding 2D/3D video in NeRF three-dimensional scene reconstruction, which includes:
Step 1: obtaining the viewing angle parameters and the embedding position of the 2D/3D video, inputting them into a trained NeRF offline model, and obtaining the NeRF three-dimensional space scene;
Step 2: voxelizing the picture frames of the 2D or 3D video stream to be embedded and then embedding it into the NeRF three-dimensional space scene to obtain a video-embedded three-dimensional space scene;
Step 3: performing joint volume rendering on the video-embedded three-dimensional space scene to obtain the three-dimensional video with the 2D or 3D video stream embedded under the viewing angle parameters.
In the above volume rendering method for embedding 2D/3D video in NeRF three-dimensional scene reconstruction, step 1 includes obtaining the viewing angle parameters from a head-mounted VR display, or obtaining the binocular camera parameters of the viewpoint through real-time human-eye recognition and positioning and using them as the viewing angle parameters.
In the above volume rendering method for embedding 2D/3D video in NeRF three-dimensional scene reconstruction, the picture frame voxelization in step 2 includes:
voxelizing the picture frames of the video stream according to the video embedding position and the resolution it covers, wherein the information saved in each voxel includes the RGB three-channel color value and the voxel opacity, the color value follows the original picture frame, and the voxel opacity is the probability that light is absorbed after passing through the voxel.
In the above volume rendering method for embedding 2D/3D video in NeRF three-dimensional scene reconstruction, step 3 includes:
determining the ray sampling area according to the viewing angle parameters, and integrating the color and opacity of the voxels traversed along the ray direction until the ray is absorbed, the integrated value being the sampled color value used as the rendering result of the current frame; if a voxel traversed by the ray coincides with a voxel of the video stream, selecting the color value and opacity value of the video-stream voxel; and assembling the rendering results of all frames to constitute the three-dimensional video.
The present invention also proposes a volume rendering system for embedding 2D/3D video in NeRF three-dimensional scene reconstruction, which includes:
an initial module for obtaining the viewing angle parameters and the embedding position of the 2D/3D video, inputting them into a trained NeRF offline model, and obtaining the NeRF three-dimensional space scene;
an embedding module for voxelizing the picture frames of the 2D or 3D video stream to be embedded and then embedding it into the NeRF three-dimensional space scene to obtain a video-embedded three-dimensional space scene;
a rendering module for performing joint volume rendering on the video-embedded three-dimensional space scene to obtain the three-dimensional video with the 2D or 3D video stream embedded under the viewing angle parameters.
In the above volume rendering system for embedding 2D/3D video in NeRF three-dimensional scene reconstruction, the initial module is used to obtain the viewing angle parameters from a head-mounted VR display, or to obtain the binocular camera parameters of the viewpoint through real-time human-eye recognition and positioning and use them as the viewing angle parameters.
In the above volume rendering system for embedding 2D/3D video in NeRF three-dimensional scene reconstruction, the picture frame voxelization includes:
voxelizing the picture frames of the video stream according to the video embedding position and the resolution it covers, wherein the information saved in each voxel includes the RGB three-channel color value and the voxel opacity, the color value follows the original picture frame, and the voxel opacity is the probability that light is absorbed after passing through the voxel.
In the above volume rendering system for embedding 2D/3D video in NeRF three-dimensional scene reconstruction, the rendering module includes:
determining the ray sampling area according to the viewing angle parameters, and integrating the color and opacity of the voxels traversed along the ray direction until the ray is absorbed, the integrated value being the sampled color value used as the rendering result of the current frame; if a voxel traversed by the ray coincides with a voxel of the video stream, selecting the color value and opacity value of the video-stream voxel; and assembling the rendering results of all frames to constitute the three-dimensional video.
The present invention also proposes a storage medium for storing a program that executes any one of the above volume rendering methods for embedding 2D/3D video in NeRF three-dimensional scene reconstruction.
The present invention also proposes a client for use with any one of the above volume rendering systems for embedding 2D/3D video in NeRF three-dimensional scene reconstruction.
It can be seen from the above solutions that the advantages of the present invention are: compared with existing volumetric video production tools, the present invention greatly shortens the production cycle of volumetric video and reduces production costs, while increasing the editability of volumetric video.
Brief Description of the Drawings
Figure 1 is a block diagram of the technology of the present invention for fusing 2D/3D video with a NeRF three-dimensional scene;
Figure 2 is a schematic diagram of ray sampling during the rendering process of the present invention.
Best Mode for Carrying Out the Invention
Using the volume rendering principle of NeRF, the present invention proposes a rendering pipeline: the 2D/3D video is embedded into a designated area of the three-dimensional scene during the volume rendering process, so as to fuse the 2D/3D video with the NeRF three-dimensional scene. Volume rendering in the present invention considers only voxel absorption: the voxels are treated as cold, black particles that absorb, with a certain probability, all light that hits them; the voxels neither emit nor scatter light. To achieve the above technical effects, the present invention includes the following key technical points: using 2D/3D video to enrich the NeRF-reconstructed three-dimensional scene, and using the principle of volume rendering to jointly render the 2D/3D video and the NeRF model.
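For reference, this absorption-only accumulation can be written in the standard discrete volume rendering form; the notation below is an illustrative sketch consistent with the description rather than a formula quoted from the patent. With samples i = 1, ..., N taken along a ray, per-sample color c_i and per-sample opacity \alpha_i in [0, 1]:

    C = \sum_{i=1}^{N} T_i \, \alpha_i \, c_i , \qquad T_i = \prod_{j<i} (1 - \alpha_j)

Here T_i is the probability that the ray reaches sample i without having been absorbed, and the accumulation stops once the transmittance becomes negligible (the ray is absorbed).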
In order to make the above features and effects of the present invention clearer and easier to understand, embodiments are given below and described in detail with reference to the accompanying drawings.
With the rapid development of AR/VR, glasses-free 3D, and holographic display devices, the demand for 3D video and even free-viewpoint volumetric video keeps growing, yet the current volumetric video production process suffers from high shooting costs and long production cycles. Volumetric video production based on NeRF three-dimensional reconstruction can effectively reduce shooting costs and post-production time. However, limited by computing power and memory overhead, current NeRF reconstruction methods can only reconstruct scenes of limited extent, which affects the richness of the reconstructed volumetric video. In order to diversify volumetric video content and make full use of existing 2D/3D video resources, the present invention proposes a joint volume rendering technique that embeds 2D/3D video into the NeRF-reconstructed three-dimensional scene to obtain volumetric video with richer content and greater immersion.
The overall technical framework of the present invention is shown in Figure 1. Because the NeRF model represents the three-dimensional scene implicitly within a neural network, the camera parameters of the viewing angle and the embedding position of the 2D/3D video are first input into the trained NeRF offline model to determine the three-dimensional scene area to be rendered. Embedding the 2D/3D video stream is equivalent to explicitly inserting the video stream into the NeRF three-dimensional scene (comparable to placing a display screen in the space).
Next, during volume rendering, the 2D or 3D video stream is embedded into the corresponding NeRF three-dimensional space scene, and the fused binocular RGB image is finally output. The specific implementation details of each module are described below:
Step S1, viewing viewpoint camera parameters: different viewing devices obtain the viewpoint camera parameters in different ways. For a VR/AR head-mounted display, the binocular camera parameters of the viewing viewpoint can be obtained directly; for 3D light-field displays and holographic projection, the binocular camera parameters of the viewpoint can be obtained through real-time human-eye recognition and positioning technology.
The camera parameters include an extrinsic parameter matrix and an intrinsic parameter matrix, through which three-dimensional space points can be mapped to image space. The camera extrinsic matrix consists of a rotation matrix and a translation matrix, which together describe how points are transformed from the world coordinate system to the camera coordinate system. The camera intrinsic matrix converts the image coordinate system into the pixel coordinate system. In subsequent rendering, once the viewing angle is determined, the camera parameters are used to map the three-dimensional space points in the viewing direction to image space, producing the corresponding two-dimensional RGB image.
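As a concrete illustration of this mapping (standard pinhole-camera notation, not symbols defined in the patent), a world-space point X_w is projected to pixel coordinates (u, v) by

    \lambda \, (u, v, 1)^{\top} = K \, [R \mid t] \, (X_w, 1)^{\top}

where R and t are the extrinsic rotation and translation, K is the intrinsic matrix, and \lambda is the depth of the point along the camera viewing direction.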
Step S2, embedding position: based on the size of the NeRF-reconstructed scene and the resolution of the 2D/3D video, an automated position recommendation algorithm is used to recommend the most suitable video embedding position, and manual adjustment is also supported.
Step S3, NeRF offline model: video is captured with a multi-camera array, and a NeRF light field model is trained to store the information of the volumetric video.
Step S4, 2D/3D video stream: this can be existing video material or a video stream captured in real time.
Step S5, picture frame voxelization: the picture frame is voxelized according to the video embedding position and resolution determined in step S2, and the thickness of the voxelized frame can be chosen according to the desired presentation effect. The information saved in each voxel includes the RGB three-channel color value (0 to 255) and an opacity value (0 to 1); the color value follows the original picture frame, the opacity can be set freely, and the voxel opacity is the probability that light is absorbed after passing through the voxel.
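A minimal sketch of this voxelization step is given below for illustration; the function name voxelize_frame and parameters such as origin, right, up, pixel_size, and thickness are assumptions introduced here, not identifiers from the patent. It simply turns one frame into a thin slab of colored voxels placed at the embedding position.

    import numpy as np

    def voxelize_frame(frame_rgb, origin, right, up, pixel_size, opacity=1.0, thickness=1):
        """Turn one video frame into a thin slab of voxels placed in the NeRF scene.

        frame_rgb : (H, W, 3) uint8 array; color values 0-255 taken from the frame.
        origin    : 3D position of the frame's top-left corner in scene coordinates.
        right, up : unit vectors spanning the frame plane in the scene.
        pixel_size: voxel edge length, so the slab covers the frame resolution.
        opacity   : probability (0-1) that a ray is absorbed when crossing a voxel.
        thickness : number of voxel layers along the frame normal (presentation choice).
        """
        h, w, _ = frame_rgb.shape
        normal = np.cross(right, up)
        centers, colors = [], []
        for layer in range(thickness):
            for y in range(h):
                for x in range(w):
                    # Center of the voxel corresponding to pixel (x, y) in this layer.
                    p = (origin + (x + 0.5) * pixel_size * right
                                - (y + 0.5) * pixel_size * up
                                + (layer + 0.5) * pixel_size * normal)
                    centers.append(p)
                    colors.append(frame_rgb[y, x] / 255.0)
        centers = np.asarray(centers)
        colors = np.asarray(colors)
        alphas = np.full(len(centers), opacity)  # one freely chosen opacity per voxel
        return centers, colors, alphas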
Step S6, the rendering of the NeRF offline model and the rendering of the 2D/3D video stream are fused, that is, joint rendering. The image rendering process for a given viewpoint is divided into the following steps (an illustrative rendering loop is sketched after step S64):
Step S61. Determine the ray sampling area using the parameters determined in step S1;
Step S62. As shown in Figure 2, the traveling direction of the ray is obtained from the camera parameters, and the colors of the voxels traversed are integrated along this direction until the ray is absorbed; the integrated value is the color value of this sample, and the color and opacity of the voxels are computed by the NeRF model;
Step S63. If a voxel along the ray coincides with a voxel predefined in step S5, the color value and opacity value of the step S5 voxel are selected;
Step S64. The RGB color value of each pixel is sampled 100 times and averaged (to reduce statistical error).
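The following is a minimal sketch of this joint rendering loop under the absorption-only model described above. It is illustrative only: render_pixel, nerf_model.query, video_voxels.lookup, and the sampling constants are placeholder names assumed here, not the patent's implementation.

    import numpy as np

    def render_pixel(ray_origin, ray_dir, nerf_model, video_voxels,
                     near=0.0, far=4.0, n_steps=128, n_samples=100):
        """Render one pixel by jointly ray marching the NeRF scene and the video voxels."""
        colors = []
        for _ in range(n_samples):                       # step S64: average many samples
            # Jittered sample positions along the ray inside the sampling area (steps S61/S62).
            t = np.linspace(near, far, n_steps) + np.random.uniform(0, (far - near) / n_steps, n_steps)
            color = np.zeros(3)
            transmittance = 1.0                          # probability the ray has not yet been absorbed
            for ti in t:
                p = ray_origin + ti * ray_dir
                hit = video_voxels.lookup(p)             # step S63: video voxel takes precedence
                if hit is not None:
                    c, alpha = hit
                else:
                    c, alpha = nerf_model.query(p, ray_dir)  # step S62: NeRF color and opacity
                color += transmittance * alpha * np.asarray(c)  # absorption-only accumulation
                transmittance *= (1.0 - alpha)
                if transmittance < 1e-3:                 # ray effectively absorbed
                    break
            colors.append(color)
        return np.mean(colors, axis=0)                   # averaged RGB value for this pixel

Averaging n_samples jittered rays per pixel corresponds to step S64 and suppresses the variance introduced by the random sample positions.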
Industrial Applicability
The present invention proposes a volume rendering method and system for embedding 2D/3D video in NeRF three-dimensional scene reconstruction. First, the camera parameters of the viewing angle and the embedding position of the 2D/3D video are input into the trained NeRF offline model to determine the three-dimensional scene area to be rendered. The 2D/3D video stream is explicitly inserted into the NeRF three-dimensional scene: the 2D or 3D video stream is embedded into the corresponding NeRF three-dimensional space scene through volume rendering, and the fused binocular RGB image is finally output. The present invention makes full use of existing 2D/3D video resources and enriches the volumetric video material library, shortening the existing volumetric video production cycle, reducing production costs, and increasing the editability of volumetric video.

Claims (10)

  1. A volume rendering method for embedding 2D/3D video in NeRF three-dimensional scene reconstruction, characterized by comprising:
    Step 1: obtaining viewing angle parameters and an embedding position of the 2D/3D video, inputting them into a trained NeRF offline model, and obtaining a NeRF three-dimensional space scene;
    Step 2: voxelizing picture frames of the 2D or 3D video stream to be embedded and then embedding it into the NeRF three-dimensional space scene to obtain a video-embedded three-dimensional space scene;
    Step 3: performing joint volume rendering on the video-embedded three-dimensional space scene to obtain a three-dimensional video with the 2D or 3D video stream embedded under the viewing angle parameters.
  2. The volume rendering method for embedding 2D/3D video in NeRF three-dimensional scene reconstruction according to claim 1, characterized in that step 1 comprises obtaining the viewing angle parameters from a head-mounted VR display, or obtaining binocular camera parameters of the viewpoint through real-time human-eye recognition and positioning and using them as the viewing angle parameters.
  3. The volume rendering method for embedding 2D/3D video in NeRF three-dimensional scene reconstruction according to claim 1, characterized in that the picture frame voxelization in step 2 comprises:
    voxelizing the picture frames of the video stream according to the video embedding position and the resolution it covers, wherein the information saved in each voxel includes the RGB three-channel color value and the voxel opacity, the color value follows the original picture frame, and the voxel opacity is the probability that light is absorbed after passing through the voxel.
  4. The volume rendering method for embedding 2D/3D video in NeRF three-dimensional scene reconstruction according to claim 1, characterized in that step 3 comprises:
    determining the ray sampling area according to the viewing angle parameters, and integrating the color and opacity of the voxels traversed along the ray direction until the ray is absorbed, the integrated value being the sampled color value used as the rendering result of the current frame; if a voxel traversed by the ray coincides with a voxel of the video stream, selecting the color value and opacity value of the video-stream voxel; and assembling the rendering results of all frames to constitute the three-dimensional video.
  5. A volume rendering system for embedding 2D/3D video in NeRF three-dimensional scene reconstruction, characterized by comprising:
    an initial module for obtaining viewing angle parameters and an embedding position of the 2D/3D video, inputting them into a trained NeRF offline model, and obtaining a NeRF three-dimensional space scene;
    an embedding module for voxelizing picture frames of the 2D or 3D video stream to be embedded and then embedding it into the NeRF three-dimensional space scene to obtain a video-embedded three-dimensional space scene;
    a rendering module for performing joint volume rendering on the video-embedded three-dimensional space scene to obtain a three-dimensional video with the 2D or 3D video stream embedded under the viewing angle parameters.
  6. The volume rendering system for embedding 2D/3D video in NeRF three-dimensional scene reconstruction according to claim 5, characterized in that the initial module is configured to obtain the viewing angle parameters from a head-mounted VR display, or to obtain binocular camera parameters of the viewpoint through real-time human-eye recognition and positioning and use them as the viewing angle parameters.
  7. The volume rendering system for embedding 2D/3D video in NeRF three-dimensional scene reconstruction according to claim 5, characterized in that the picture frame voxelization comprises:
    voxelizing the picture frames of the video stream according to the video embedding position and the resolution it covers, wherein the information saved in each voxel includes the RGB three-channel color value and the voxel opacity, the color value follows the original picture frame, and the voxel opacity is the probability that light is absorbed after passing through the voxel.
  8. The volume rendering system for embedding 2D/3D video in NeRF three-dimensional scene reconstruction according to claim 5, characterized in that the rendering module comprises:
    determining the ray sampling area according to the viewing angle parameters, and integrating the color and opacity of the voxels traversed along the ray direction until the ray is absorbed, the integrated value being the sampled color value used as the rendering result of the current frame; if a voxel traversed by the ray coincides with a voxel of the video stream, selecting the color value and opacity value of the video-stream voxel; and assembling the rendering results of all frames to constitute the three-dimensional video.
  9. A storage medium for storing a program that executes the volume rendering method for embedding 2D/3D video in NeRF three-dimensional scene reconstruction according to any one of claims 1 to 4.
  10. A client for the volume rendering system for embedding 2D/3D video in NeRF three-dimensional scene reconstruction according to any one of claims 5 to 8.
PCT/CN2022/110907 2022-08-08 2022-08-08 在NeRF三维场景重建中嵌入2D/3D视频的体积渲染方法及系统 WO2024031251A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/110907 WO2024031251A1 (zh) 2022-08-08 2022-08-08 在NeRF三维场景重建中嵌入2D/3D视频的体积渲染方法及系统

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/110907 WO2024031251A1 (zh) 2022-08-08 2022-08-08 在NeRF三维场景重建中嵌入2D/3D视频的体积渲染方法及系统

Publications (1)

Publication Number Publication Date
WO2024031251A1 true WO2024031251A1 (zh) 2024-02-15

Family

ID=89850231

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/110907 WO2024031251A1 (zh) 2022-08-08 2022-08-08 在NeRF三维场景重建中嵌入2D/3D视频的体积渲染方法及系统

Country Status (1)

Country Link
WO (1) WO2024031251A1 (zh)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100060640A1 (en) * 2008-06-25 2010-03-11 Memco, Inc. Interactive atmosphere - active environmental rendering
CN113888689A (zh) * 2021-11-05 2022-01-04 上海壁仞智能科技有限公司 图像渲染模型训练、图像渲染方法及装置
CN114119838A (zh) * 2022-01-24 2022-03-01 阿里巴巴(中国)有限公司 体素模型与图像生成方法、设备及存储介质
CN114627223A (zh) * 2022-03-04 2022-06-14 华南师范大学 一种自由视点视频合成方法、装置、电子设备及存储介质


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22954231

Country of ref document: EP

Kind code of ref document: A1