CN117333637B - Modeling and rendering method, device and equipment for three-dimensional scene - Google Patents

Modeling and rendering method, device and equipment for three-dimensional scene

Info

Publication number
CN117333637B
CN117333637B (Application CN202311631762.7A)
Authority
CN
China
Prior art keywords
sampling point
modeled
scene
information
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311631762.7A
Other languages
Chinese (zh)
Other versions
CN117333637A
Inventor
方顺
崔铭
冯星
张志恒
张亚男
吕艳娜
李荣华
傅晨阳
刘娟娟
刘晓涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xuanguang Technology Co ltd
Original Assignee
Beijing Xuanguang Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xuanguang Technology Co ltd filed Critical Beijing Xuanguang Technology Co ltd
Priority to CN202311631762.7A
Publication of CN117333637A
Application granted
Publication of CN117333637B
Legal status: Active
Anticipated expiration


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/0475: Generative networks
    • G06N 3/048: Activation functions
    • G06N 3/08: Learning methods
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 15/00: 3D [Three Dimensional] image rendering
    • G06T 15/005: General purpose rendering architectures
    • G06T 15/04: Texture mapping
    • G06T 15/50: Lighting effects
    • G06T 15/506: Illumination models
    • G06T 15/60: Shadow generation
    • G06T 17/00: Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T 17/10: Constructive solid geometry [CSG] using solid primitives, e.g. cylinders, cubes
    • G06T 17/20: Finite element generation, e.g. wire-frame surface description, tesselation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Graphics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Geometry (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Generation (AREA)

Abstract

The application discloses a modeling and rendering method, device and equipment for a three-dimensional scene, relates to the technical field of three-dimensional modeling, and can realize accurate modeling of a three-dimensional scene and restore the shape, appearance and illumination of a 3D model with high quality. The method comprises the following steps: acquiring a first sampling point obtained by 3D space sampling of a scene picture to be modeled; according to attribute information of the first sampling point, extracting information from the scene picture to be modeled by using a pre-trained model network to obtain the volume density of a second sampling point and parameter information of the second sampling point in different material dimensions; according to the parameter information of the second sampling point in the different material dimensions, extracting color from the scene picture to be modeled by using a pre-trained color generation network to obtain color information of the second sampling point; and determining three-dimensional scene model data of the scene picture to be modeled according to the volume density of the second sampling point, the parameter information of the second sampling point in the different material dimensions, and the color information of the second sampling point.

Description

Modeling and rendering method, device and equipment for three-dimensional scene
Technical Field
The present disclosure relates to the field of three-dimensional modeling technologies, and in particular, to a method, an apparatus, and a device for modeling and rendering a three-dimensional scene.
Background
Modeling of a three-dimensional scene refers to the process of creating and designing a three-dimensional environment using computer software. By constructing a virtual three-dimensional scene, geographic environments, buildings, landscapes and other elements of the real world can be simulated, or an imagined scene can be created.
In the related art, modeling of a three-dimensional scene can be achieved through computer graphics. This process relies on a strict mathematical model or an empirical model, for example one based on a bidirectional reflectance distribution function and a global illumination model: the ray emitted from each pixel of the camera screen undergoes multiple bounces and refractions with the objects in the scene, so that a realistic three-dimensional scene is restored and a model is subsequently built. Although the computer graphics method can in principle compute with sufficient accuracy, in actual use it is an approximate calculation, and a large amount of computation is still required, especially for complex multi-object scenes and multi-light-source situations.
Disclosure of Invention
In view of this, the present application provides a method, apparatus and device for modeling and rendering a three-dimensional scene, which mainly aim to solve the problem in the prior art that three-dimensional scene modeling based on computer graphics requires a large amount of computation.
According to a first aspect of the present application, there is provided a modeling method of a three-dimensional scene, including:
acquiring a first sampling point obtained by 3D space sampling of a scene picture to be modeled;
according to the attribute information of the first sampling point, information extraction is carried out on a scene picture to be modeled by utilizing a pre-trained model network, so that the volume density of a second sampling point and the parameter information of the second sampling point on different material dimensions are obtained, wherein the second sampling point is obtained by sampling rays of a preset visual angle in the scene to be modeled through a 3D space;
according to the parameter information of the second sampling point on different material dimensions, performing color extraction on the scene picture to be modeled by utilizing a pre-trained color generation network to obtain the color information of the second sampling point;
and determining three-dimensional scene model data of the scene picture to be modeled according to the volume density of the second sampling point, the parameter information of the second sampling point on different material dimensions and the color information of the second sampling point.
According to a second aspect of the present application, there is provided a rendering method of a three-dimensional scene, including:
extracting the volume density of the second sampling point and the color information of the second sampling point from the three-dimensional scene model data;
According to the volume density of the second sampling point and the color information of the second sampling point, performing screen pixel volume rendering and light source volume rendering on a scene picture to be modeled, wherein the screen pixel volume rendering is used for performing color rendering on a 3D model, and the light source volume rendering is used for performing shadow rendering on the 3D model;
the three-dimensional scene model data are obtained by using a modeling method of the three-dimensional scene.
According to a third aspect of the present application, there is provided a modeling apparatus of a three-dimensional scene, comprising:
the first acquisition unit is used for acquiring a first sampling point obtained by 3D space sampling of a scene picture to be modeled;
the first extraction unit is used for extracting information of a scene picture to be modeled by utilizing a pre-trained model network according to the attribute information of the first sampling point to obtain the volume density of a second sampling point and the parameter information of the second sampling point on different material dimensions, wherein the second sampling point is obtained by sampling rays of a preset visual angle in the scene to be modeled through a 3D space;
the second extraction unit is used for carrying out color extraction on the scene picture to be modeled by utilizing a pre-trained color generation network according to the parameter information of the second sampling point on different material dimensions to obtain the color information of the second sampling point;
The determining unit is used for determining three-dimensional scene model data of the scene picture to be modeled according to the volume density of the second sampling point, the parameter information of the second sampling point on different material dimensions and the color information of the second sampling point.
According to a fourth aspect of the present application, there is provided a rendering apparatus of a three-dimensional scene, including:
the extraction unit is used for extracting the volume density of the second sampling point and the color information of the second sampling point from the three-dimensional scene model data;
the rendering unit is used for performing screen pixel volume rendering and light source volume rendering on the scene picture to be modeled according to the volume density of the second sampling point and the color information of the second sampling point, wherein the screen pixel volume rendering is used for performing color rendering on the 3D model, and the light source volume rendering is used for performing shadow rendering on the 3D model;
the three-dimensional scene model data are obtained by using a modeling method of the three-dimensional scene.
According to a fifth aspect of the present application there is provided a computer device comprising a memory storing a computer program and a processor implementing the steps of the method of the first aspect described above when the computer program is executed by the processor.
According to a sixth aspect of the present application there is provided a readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the method of the first aspect described above.
By means of the above technical solution, compared with the prior-art approach of modeling a three-dimensional scene through computer graphics, the modeling and rendering method, device and equipment for a three-dimensional scene provided by the present application proceed as follows. A first sampling point obtained by 3D space sampling of a scene picture to be modeled is acquired. According to attribute information of the first sampling point, information is first extracted from the scene picture to be modeled by using a pre-trained model network, obtaining the volume density of a second sampling point and parameter information of the second sampling point in different material dimensions, wherein the second sampling point is obtained by 3D space sampling of a ray at a preset viewing angle in the scene to be modeled. Then, according to the parameter information of the second sampling point in the different material dimensions, color is extracted from the scene picture to be modeled by using a pre-trained color generation network, obtaining color information of the second sampling point. Finally, three-dimensional scene model data of the scene picture to be modeled are determined according to the volume density of the second sampling point, the parameter information of the second sampling point in the different material dimensions, and the color information of the second sampling point. In this process, the attribute information of the scene picture to be modeled in different material dimensions is first estimated by the model network, the color information of the scene picture to be modeled is then estimated by the color generation network, and the scene information to be modeled is obtained by combining the attribute information in the different material dimensions with the color information, so that the shape, appearance and illumination of the 3D model can be restored with high quality.
The foregoing description is only an overview of the technical solutions of the present application. In order that the technical means of the present application may be understood more clearly and implemented according to the content of the specification, and in order to make the above and other objects, features and advantages of the present application more apparent, specific embodiments of the present application are set forth in the detailed description below.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application. In the drawings:
FIG. 1 is a flow chart of a method for modeling a three-dimensional scene in an embodiment of the present application;
FIG. 2 is a flow chart of step 102 in FIG. 1;
FIG. 3 is a flow chart of a method of modeling a three-dimensional scene in another embodiment of the present application;
FIG. 4 is a flow chart of a method for modeling a three-dimensional scene in another embodiment of the present application;
FIG. 5 is a schematic structural diagram of a modeling framework of a three-dimensional scene in an embodiment of the present application;
FIG. 6 is a schematic diagram of a shape network according to an embodiment of the present application;
FIG. 7A is a schematic diagram of a material network according to an embodiment of the present disclosure;
FIG. 7B is a schematic diagram of a geometric material sub-network according to an embodiment of the present disclosure;
FIG. 7C is a schematic diagram of a roughness material sub-network according to an embodiment of the present disclosure;
FIG. 7D is a schematic diagram of a specular reflection material sub-network according to an embodiment of the present disclosure;
FIG. 7E is a schematic diagram of a diffuse reflection material sub-network according to an embodiment of the present disclosure;
FIG. 8 is a schematic diagram of a color generation network according to an embodiment of the present application;
FIG. 9 is a flow chart of a method for rendering a three-dimensional scene according to an embodiment of the present application;
FIG. 10 is a schematic structural diagram of a modeling apparatus for a three-dimensional scene in an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a rendering apparatus for a three-dimensional scene in an embodiment of the present application;
fig. 12 is a schematic diagram of an apparatus structure of a computer device according to an embodiment of the present invention.
Detailed Description
The present disclosure will now be discussed with reference to several exemplary embodiments. It should be understood that these embodiments are discussed only to enable those of ordinary skill in the art to better understand and thus practice the teachings of the present invention, and are not meant to imply any limitation on the scope of the invention.
As used herein, the term "comprising" and variants thereof are to be interpreted as the open-ended term "including but not limited to". The term "based on" is to be interpreted as "based at least in part on". The terms "one embodiment" and "an embodiment" are to be interpreted as "at least one embodiment". The term "another embodiment" is to be interpreted as "at least one other embodiment".
In the related art, modeling of a three-dimensional scene can be achieved through computer graphics. This process relies on a strict mathematical model or an empirical model, for example one based on a bidirectional reflectance distribution function and a global illumination model: the ray emitted from each pixel of the camera screen undergoes multiple bounces and refractions with the objects in the scene, so that a realistic three-dimensional scene is restored and a model is subsequently built. Although the computer graphics method can in principle compute with sufficient accuracy, in actual use it is an approximate calculation, and a large amount of computation is still required, especially for complex multi-object scenes and multi-light-source situations. With the continuous emergence of machine learning methods, modeling methods based on the inverse rendering idea can restore single elements of the information in a scene to be modeled, but the scene information restored from a single element cannot describe the object attributes in the scene to be modeled comprehensively. In order to solve this problem, the present application inputs the scene picture to be modeled into a model network based on the inverse rendering idea, extracts the physical attributes of the scene to be modeled through the model network, and determines the three-dimensional scene model data of the scene picture to be modeled according to the physical attribute information, so that the shape, appearance and illumination information of the 3D model corresponding to the scene to be modeled are restored with high quality.
Specifically, the embodiment provides a modeling method of a three-dimensional scene, as shown in fig. 1, where the method is applied to a server corresponding to three-dimensional modeling, and includes the following steps:
101. and acquiring a first sampling point obtained by 3D space sampling of the scene picture to be modeled.
The scene picture to be modeled can be a two-dimensional picture obtained by shooting from any viewing angle with a camera device in the scene to be modeled. The 3D spatial sampling is equivalent to sampling the three-dimensional voxel space corresponding to the scene picture to be modeled, and the first sampling point obtained by sampling is a voxel rather than a pixel.
In this embodiment, the 3D spatial sampling may include camera ray sampling and light source ray sampling. The camera ray sampling is used to determine the color of each sampling point, so that the weight-accumulated color of a screen pixel is obtained through volume rendering. For the acquisition of camera rays, a view of one viewing angle can be selected from the input views and regarded as the camera screen at the corresponding viewing-angle position in the scene to be modeled; the view pixels are the screen pixels of the camera, and each camera pixel emits one camera ray into the scene to be modeled. The light source ray sampling is used to determine the brightness of each sampling point so as to generate shadow effects. For the acquisition of light source rays, once the position and orientation of the camera device are given, the current imaging plane can be determined, and a light source ray is obtained by connecting the camera position coordinates with a pixel on this plane. It should be noted that the light source here is an additional light source in the scene to be modeled, equivalent to a light source newly added after the scene is created. Specifically, two sampling methods are used for both the camera rays and the light source rays: one is uniform sampling and the other is importance sampling. The sampling strategy first performs uniform sampling and then performs importance sampling on top of it. Before uniform sampling is performed, the bounding box of an object can be determined and the intersection points of the camera ray with the bounding box obtained; whether the ray is inside or outside the object is determined by the parity of the intersections: an odd-numbered intersection of the ray with the bounding box means the ray enters the bounding box, an even-numbered intersection means it exits, and the interior of the bounding box lies between an odd-numbered intersection and the following even-numbered one. Uniform sampling is then performed, with relatively sparse sampling points distributed evenly along the ray near and inside the intersections with the bounding box, and the volume density and color of each uniform sampling point are obtained. After the volume densities of the uniform sampling points are obtained, the places with high volume density contribute most to the final color or brightness, and relatively dense importance sampling is performed again at those places. This embodiment applies these two sampling strategies to both the camera ray sampling and the light source ray sampling, which can greatly reduce the number of sampling points and improve the overall performance and accuracy.
In practical application, during the 3D space sampling of the scene picture to be modeled, the modeling effect of the three-dimensional scene can be improved by sampling at different scales: global sampling is first performed at a coarse-granularity scale to obtain overall scene information, and local sampling is then performed at a fine-granularity scale to obtain more detailed information. This multi-scale sampling strategy can effectively balance image quality and computational complexity.
For example, in the 3D spatial sampling process, 64 points are first sampled uniformly along a ray to obtain the overall scene information; the place where the volume density increases sharply is then determined from the volume densities, and 32 additional sampling points are allocated for random sampling between the two adjacent uniform sampling points before and after the position where the volume density increases.
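A minimal Python sketch of this two-stage strategy (coarse uniform sampling followed by importance sampling driven by the coarse densities) is given below. The function names, the use of NumPy, and inverse-transform sampling of the density-derived CDF are illustrative assumptions rather than the exact procedure of the patent.

```python
import numpy as np

def sample_ray(t_near, t_far, density_fn, n_uniform=64, n_importance=32):
    """Two-stage sampling along one ray: uniform first, then importance
    sampling concentrated where the coarse volume density is high."""
    # Stage 1: uniform (coarse) samples between the ray/bounding-box intersections.
    t_uniform = np.linspace(t_near, t_far, n_uniform)
    sigma = density_fn(t_uniform)                      # coarse volume density per sample

    # Turn the coarse densities into a probability distribution over intervals.
    weights = sigma[:-1] + 1e-5                        # avoid a degenerate all-zero pdf
    pdf = weights / weights.sum()
    cdf = np.cumsum(pdf)

    # Stage 2: importance samples via inverse-transform sampling of the CDF,
    # so that more samples land in intervals with high density.
    u = np.random.rand(n_importance)
    idx = np.clip(np.searchsorted(cdf, u), 0, n_uniform - 2)
    left, right = t_uniform[idx], t_uniform[idx + 1]
    t_importance = left + (right - left) * np.random.rand(n_importance)

    return np.sort(np.concatenate([t_uniform, t_importance]))
```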
The execution body of this embodiment may be a modeling apparatus or device for a three-dimensional scene, and the apparatus or device may be configured at a server corresponding to three-dimensional modeling. For an opaque object, the sampling process can be further simplified: only the intersection point of a camera ray and the object surface needs to be calculated, this intersection point is connected with the light source and its brightness is calculated, and the brightness of all other sampling points is calculated from this one sampling point. For a transparent object, the sampling process can also be simplified: only a limited number of sampling points are selected in the region where the camera-ray density is high, these sampling points are connected with the light source, the light-source direction is sampled, the brightness information of the limited sampling points is calculated, and the brightness of other points is then calculated from the average brightness of these limited sampling points.
102. And extracting information of the scene picture to be modeled by utilizing a pre-trained model network according to the attribute information of the first sampling point to obtain the volume density of the second sampling point and the parameter information of the second sampling point on different material dimensions.
The attribute information of the first sampling point is its position in the 3D space. The pre-trained model network comprises a shape network and a material network. The position of the first sampling point in the 3D space is an xyz three-dimensional vector, which is first encoded; the three-dimensional vector can be encoded with a trigonometric-function encoding. The trigonometric-function encoding makes the output image sharper rather than blurred and facilitates the training of high-frequency signals.
The structure of the shape network can specifically have 8 hidden layers with 256 nodes per layer; after the 4th layer, a skip connection concatenates the encoded input position, which is fed into the 5th layer together with the hidden features, and each layer uses Softplus as the activation function. Shape information of the scene picture to be modeled is extracted through this shape network part, obtaining a model represented by an SDF.
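As an illustration of the structure described above (trigonometric positional encoding of the xyz position, 8 hidden layers of 256 nodes, a skip connection of the encoded input after the 4th layer, Softplus activations), a PyTorch-style sketch is given below; the class name, the number of encoding frequencies and other parameter choices are assumptions made only for this example.

```python
import torch
import torch.nn as nn

def positional_encoding(x, n_freqs=6):
    """Trigonometric (sin/cos) encoding of a 3D position, as described above."""
    out = [x]
    for k in range(n_freqs):
        out += [torch.sin(2.0 ** k * x), torch.cos(2.0 ** k * x)]
    return torch.cat(out, dim=-1)

class ShapeNetwork(nn.Module):
    """8 hidden layers x 256 nodes, skip connection of the encoded input after
    layer 4, Softplus activations; outputs one SDF value per sampling point."""
    def __init__(self, n_freqs=6, width=256):
        super().__init__()
        in_dim = 3 + 3 * 2 * n_freqs
        self.fc1 = nn.ModuleList([nn.Linear(in_dim, width)] +
                                 [nn.Linear(width, width) for _ in range(3)])
        # After layer 4 the encoded input is concatenated again (skip connection).
        self.fc2 = nn.ModuleList([nn.Linear(width + in_dim, width)] +
                                 [nn.Linear(width, width) for _ in range(3)])
        self.out = nn.Linear(width, 1)            # SDF value
        self.act = nn.Softplus()

    def forward(self, xyz):
        enc = positional_encoding(xyz)
        h = enc
        for layer in self.fc1:
            h = self.act(layer(h))
        h = torch.cat([h, enc], dim=-1)           # skip connection into layer 5
        for layer in self.fc2:
            h = self.act(layer(h))
        return self.out(h)
```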
As the information transition between the shape network and the material network, a second sampling point needs to be acquired from the model represented by the SDF. The second sampling point is obtained by 3D space sampling of a ray at a preset viewing angle in the scene to be modeled. The preset viewing angle may be the camera viewing angle, i.e. the scene picture to be modeled is 3D-space-sampled a second time, or it may be any viewing angle in the 3D space, i.e. 3D sampling is performed on an arbitrary ray emitted into the scene to be modeled.
SDF here refers to a signed distance field, which, over a finite region in space, determines the distance from a point to the region boundary and at the same time defines the sign of that distance: the value is negative inside the region boundary, positive outside, and 0 when the point lies on the boundary. The model represented by the SDF can therefore distinguish the numerical changes at the graphic boundary, inside and outside, in the scene to be modeled.
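A signed distance field can be illustrated with the simplest possible example, a sphere; the helper below is purely illustrative, and the sign convention (positive outside, negative inside, zero on the surface) matches the mesh-extraction description later in this text.

```python
import numpy as np

def sphere_sdf(p, center=np.zeros(3), radius=1.0):
    """Signed distance from point p to a sphere surface:
    0 on the surface, > 0 outside, < 0 inside."""
    return np.linalg.norm(p - center) - radius

# The zero level set is the model surface:
print(sphere_sdf(np.array([2.0, 0.0, 0.0])))   #  1.0 -> outside
print(sphere_sdf(np.array([0.5, 0.0, 0.0])))   # -0.5 -> inside
print(sphere_sdf(np.array([1.0, 0.0, 0.0])))   #  0.0 -> on the surface
```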
The structure of the material network can specifically include a plurality of material sub-networks for extracting different material information; each material sub-network is independent of the others during the extraction of material information, and together the material sub-networks output the parameter information of the second sampling point in the different material dimensions and the volume density of the second sampling point.
For example, the volume density, SDF value and normal of the second sampling point may be obtained through the material sub-network of the geometric material dimension, the roughness of the second sampling point may be obtained through the material sub-network of the roughness material dimension, the specular reflection color of the second sampling point may be obtained through the material sub-network of the specular reflection material dimension, and the albedo and diffuse reflection color of the second sampling point may be obtained through the material sub-network of the diffuse reflection material dimension.
103. And carrying out color extraction on the scene picture to be modeled by utilizing a pre-trained color generation network according to the parameter information of the second sampling point on different material dimensions, so as to obtain the color information of the second sampling point.
It can be understood that the parameter information of the second sampling point in the different material dimensions is used as the input parameters of the pre-trained color generation network, so multi-dimensional material information can be added to the 3D model corresponding to the scene picture to be modeled. This material information concerns the diffuse reflection and specular reflection of the current ambient light, is used for generating shadows and highlights, and addresses the fact that a given 3D point of the 3D model shows different colors when seen from different angles, so that from the parameter information in multiple material dimensions the color generation network can correspondingly output color information carrying the environmental effects of the scene to be modeled.
104. And determining three-dimensional scene model data of the scene picture to be modeled according to the volume density of the second sampling point, the parameter information of the second sampling point on different material dimensions and the color information of the second sampling point.
It can be understood that the volume density of the second sampling point, the parameter information of the second sampling point in the different material dimensions, and the color information of the second sampling point together reflect attributes such as the shape, appearance and color of the 3D model corresponding to the scene picture to be modeled in an all-round way. With the three-dimensional scene model data of the scene picture to be modeled, 3D modeling, volume rendering and the like can be realized, so that the picture information of the scene to be modeled is accurately restored.
Further, after three-dimensional scene model data of the scene picture to be modeled are obtained, 3D modeling can be achieved through the three-dimensional scene model data, specifically, the SDF value of the second sampling point in the shape dimension is obtained from parameter information of the second sampling point in different material dimensions, and 3D modeling is conducted on the scene picture to be modeled according to the SDF value of the second sampling point in the shape dimension.
Further, after the three-dimensional scene model data of the scene picture to be modeled is obtained, color 3D modeling can be achieved through the three-dimensional scene model data, and specifically, the color 3D modeling is conducted on the scene picture to be modeled according to the volume density of the second sampling point, the SDF value of the second sampling point in the shape dimension and the color information of the second sampling point.
Compared with the prior-art approach of modeling a three-dimensional scene through computer graphics, the modeling method of a three-dimensional scene provided by the embodiment of the present application acquires a first sampling point obtained by 3D space sampling of a scene picture to be modeled; according to attribute information of the first sampling point, information is first extracted from the scene picture to be modeled by using a pre-trained model network, obtaining the volume density of a second sampling point and parameter information of the second sampling point in different material dimensions, wherein the second sampling point is obtained by 3D space sampling of a ray at a preset viewing angle in the scene to be modeled; then, according to the parameter information of the second sampling point in the different material dimensions, color is extracted from the scene picture to be modeled by using a pre-trained color generation network, obtaining color information of the second sampling point; and three-dimensional scene model data of the scene picture to be modeled are determined according to the volume density of the second sampling point, the parameter information of the second sampling point in the different material dimensions, and the color information of the second sampling point. In this process, the attribute information of the scene picture to be modeled in different material dimensions is first estimated by the model network, the color information of the scene picture to be modeled is then estimated by the color generation network, and the scene information to be modeled is obtained by combining the attribute information in the different material dimensions with the color information.
In the above embodiment, the pre-trained model network includes a shape network and a material network; extraction of shape information can be achieved through the shape network, and extraction of different material information can be achieved through the material network. Specifically, as shown in fig. 2, step 102 includes the following steps:
201. and extracting shape information of the scene picture to be modeled by utilizing the shape network according to the attribute information of the first sampling point to obtain a model expressed by the SDF.
202. And generating a 3D model grid from the model represented by the SDF, and estimating attribute information of a second sampling point according to vertex data of the 3D model grid.
203. And extracting material information of the scene picture to be modeled by utilizing the material network according to the attribute information of the second sampling point to obtain the volume density of the second sampling point and the parameter information of the second sampling point on different material dimensions.
It can be understood that the model represented by the SDF output by the shape network essentially stores, for each first sampling point, the closest distance to the surface; that is, the model surface is drawn where the value is 0, the value being greater than 0 outside the model surface and less than 0 inside the model surface.
Specifically, in the process of generating a 3D model mesh from the model represented by the SDF, the model represented by the SDF is converted into a 3D model mesh through the marching cubes algorithm, the 3D model mesh being a 3D data format represented by a mesh. Then, considering that sampling is needed in passing from the shape network to the material network, a second sampling point obtained by 3D space sampling of the 3D model mesh is acquired, and the attribute information of the second sampling point is estimated according to the vertex data of the 3D model mesh.
Specifically, in the process of obtaining the second sampling point by 3D space sampling of the 3D model mesh, a ray in the scene to be modeled that passes through the 3D model mesh is selected as the ray of the preset viewing angle according to the 3D model mesh, and 3D space sampling is performed on the ray of the preset viewing angle to obtain the second sampling point.
It should be noted that the vertex data of the 3D model mesh and the second sampling points are points serving two different purposes: the vertex data of the 3D model mesh form the 3D shape of the model, while the second sampling points are used for volume rendering, the current pixel color of the camera being obtained by accumulating the colors of all sampling points on one camera ray; and the attribute information of the second sampling point can be estimated from the vertex data of the 3D model mesh. Specifically, in the process of estimating the attribute information of the second sampling point according to the vertex data of the 3D model mesh, several target mesh vertices adjacent to the second sampling point in the 3D model mesh can be obtained, the attribute information of these target mesh vertices is determined from the vertex data of the 3D model mesh, and the attribute information of the target mesh vertices is weighted and summed through a preset estimation algorithm to obtain the attribute information of the second sampling point.
The estimation process for the second sampling point is equivalent to using the attribute information of several mesh vertices adjacent to the second sampling point and then estimating the attribute information of the second sampling point by a preset estimation method. Specifically, each preset triangle vertex of the 3D model mesh stores geometric information $f_g$, diffuse reflection information $f_d$, specular reflection information $f_s$ and roughness information $f_r$, and the information stored at the mesh vertices is used to estimate the corresponding values $\hat{f}_g$, $\hat{f}_d$, $\hat{f}_s$ and $\hat{f}_r$ at the second sampling point.
The preset estimation method uses the K-nearest-neighbor algorithm. The attribute information of the second sampling point $x$ is estimated as
$\hat{f}(x) = \dfrac{\sum_{k \in \mathcal{N}(x)} w_k \, f_k}{\sum_{k \in \mathcal{N}(x)} w_k}$,
where $\hat{f}$ denotes the interpolated feature for geometry, diffuse reflection, specular reflection or roughness, $\mathcal{N}(x)$ is the set of K mesh vertices nearest to $x$, $f_k$ is the attribute stored at the k-th vertex, and the weight $w_k$ is determined by the distance from the sampling point to the k-th adjacent vertex.
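A sketch of this K-nearest-neighbor interpolation of vertex attributes onto a second sampling point is shown below; the inverse-distance weighting is an illustrative assumption, since the description above only states that the weights are derived from the sampling-point-to-vertex distances.

```python
import numpy as np

def knn_interpolate(x, vertices, vertex_attrs, k=4, eps=1e-8):
    """Estimate an attribute (geometry, diffuse, specular or roughness feature)
    at sampling point x as a weighted sum over the k nearest mesh vertices."""
    d = np.linalg.norm(vertices - x, axis=1)          # distances to all vertices
    idx = np.argsort(d)[:k]                           # indices of the k nearest vertices
    w = 1.0 / (d[idx] + eps)                          # closer vertices weigh more (assumed)
    w /= w.sum()
    return (w[:, None] * vertex_attrs[idx]).sum(axis=0)

# Usage: interpolate a per-vertex roughness feature at one sampling point.
verts = np.random.rand(100, 3)
rough = np.random.rand(100, 1)
f_r_hat = knn_interpolate(np.array([0.5, 0.5, 0.5]), verts, rough)
```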
In the above embodiment, the material network includes a plurality of material sub-networks for extracting different material information, and each material sub-network can extract attribute information of a different material dimension. Specifically, step 203 includes the following steps:
203-1, extracting geometric information of the scene picture to be modeled by using the first material sub-network according to the attribute information of the second sampling point, and obtaining an SDF value, a normal line and a volume density of the second sampling point in the shape dimension.
203-2, extracting roughness information of the scene picture to be modeled by using a second material sub-network according to the attribute information of the second sampling point, and obtaining the roughness information of the second sampling point in the appearance dimension.
203-3, extracting specular reflection information of the scene picture to be modeled by using a third material sub-network according to the attribute information of the second sampling point, so as to obtain the specular reflection information of the second sampling point in the illumination dimension.
203-4, extracting diffuse reflection information of the scene picture to be modeled by utilizing a fourth material sub-network according to the attribute information of the second sampling point, and obtaining diffuse reflection information of the second sampling point in the appearance dimension and the illumination dimension.
In this embodiment, the material sub-networks include a first material sub-network, a second material sub-network, a third material sub-network and a fourth material sub-network, and the different material sub-networks extract attribute information of different material dimensions. The first material sub-network serves as a geometric network for extracting the SDF value, normal and volume density of the second sampling point in the shape dimension; the second material sub-network serves as a roughness network for extracting the roughness information of the second sampling point in the appearance dimension; the third material sub-network serves as a specular reflection network for extracting the specular reflection information of the second sampling point in the illumination dimension; and the fourth material sub-network serves as a diffuse reflection network for extracting the diffuse reflection information of the second sampling point in the appearance and illumination dimensions. The attribute information of the second sampling point needs to be input into these material sub-networks, and the sampling distance of the second sampling point is used here as well. The sampling distance of the second sampling point is equivalent to the distance from the second sampling point to the 3D model mesh, and this distance can be estimated from the distances between the second sampling point and the adjacent 3D model mesh vertices. Specifically, the sampling distance of the second sampling point can be estimated through the K-nearest-neighbor algorithm, in the same form as the K-nearest-neighbor formula above:
$\hat{d}(x) = \dfrac{\sum_{k \in \mathcal{N}(x)} w_k \, d_k}{\sum_{k \in \mathcal{N}(x)} w_k}$,
where $d_k$ is the distance from the second sampling point $x$ to the k-th adjacent mesh vertex and the weight $w_k$ is the same as above.
Specifically, in the geometric network, the input parameters include the geometric information $\hat{f}_g$ and the distance $\hat{d}$ from the sampling point to the reconstructed mesh. The geometric information is mainly the three-dimensional position of the second sampling point, which can be calculated from the camera ray as $x = o + t\,d$, where $o$ is the camera center, $d$ is the viewing direction and $t$ is the travelled distance. The geometric network contains a multi-layer perceptron; before the inputs are fed into the multi-layer perceptron, they can be encoded with the trigonometric-function encoding in order to obtain clearer geometric features. The multi-layer perceptron has 4 hidden layers with 256 nodes per layer, and each layer uses a ReLU activation function. The output layer of the geometric network gives the SDF value, the normal and the volume density of the second sampling point. The SDF value is the direct output of the multi-layer perceptron and represents the SDF value of the second sampling point; the normal is obtained from the gradient of the SDF value, i.e. $n = \nabla f(x)$, where $f$ is the geometric network that generates the SDF value; and the volume density is likewise obtained from the SDF value through a logistic-style density function whose sharpness is controlled by a trainable deviation parameter $s$.
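A hedged sketch of this geometric material sub-network is given below: a 4-layer, 256-node ReLU MLP that outputs an SDF value, with the normal taken as the gradient of the SDF and the volume density derived from the SDF using a trainable deviation parameter s. The exact density formula is not spelled out above, so the NeuS-style logistic density used here is an assumption.

```python
import torch
import torch.nn as nn

class GeometrySubNet(nn.Module):
    """Geometric material sub-network sketch: 4 hidden layers x 256 nodes with ReLU.
    Inputs: the 3D sampling position (the geometric information) and the
    sampling-point-to-reconstructed-mesh distance; output: one SDF value."""
    def __init__(self, width=256):
        super().__init__()
        layers, d = [], 4                      # 3 (position) + 1 (distance); positional
        for _ in range(4):                     # encoding could be applied first as well
            layers += [nn.Linear(d, width), nn.ReLU()]
            d = width
        self.mlp = nn.Sequential(*layers, nn.Linear(width, 1))
        # Trainable deviation parameter s controlling the density sharpness (assumed).
        self.s = nn.Parameter(torch.tensor(10.0))

    def forward(self, x, dist):
        # x: leaf tensor of sampling positions, shape (N, 3); dist: shape (N, 1).
        x.requires_grad_(True)
        sdf = self.mlp(torch.cat([x, dist], dim=-1))
        # Normal = gradient of the SDF with respect to the sampling position.
        normal = torch.autograd.grad(sdf.sum(), x, create_graph=True)[0]
        # Volume density from the SDF via a logistic density (assumption):
        # sigma = s * exp(-s * sdf) / (1 + exp(-s * sdf))^2
        e = torch.exp(-self.s * sdf)
        sigma = self.s * e / (1.0 + e) ** 2
        return sdf, normal, sigma
```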
It should be noted that the geometric network outputs the SDF value of the second sampling point while the shape network outputs the model represented by the SDF; both output SDF values, but the shape network generates a 3D model mesh from its SDF values so that the mesh can be used to estimate the attribute information of the second sampling point, whereas the geometric network is used to generate the SDF value of the second sampling point so that it can be matched with the generated color information. In this way, not only the 3D model shape but also the color corresponding to that shape can be obtained by modeling. The SDF value and the normal generated by the geometric network are both used as inputs of the color generation network, and the volume density generated by the geometric network is a parameter required by volume rendering to calculate the final screen color. The shape of the generated 3D model can thus be solved by the geometric network.
Specifically, in the roughness network, the input parameters are the learnable roughness information $\hat{f}_r$ and the distance $\hat{d}$ from the sampling point to the reconstructed mesh; the roughness information can be initialized to a random value, and the roughness information is output through the roughness network. The multi-layer perceptron of the roughness network has 4 hidden layers with 256 nodes per layer, and each layer uses a ReLU activation function. The roughness information of the generated 3D model appearance can be solved by the roughness network.
Specifically, in the specular reflection network, the input parameters are the learnable specular reflection information $\hat{f}_s$ and the distance $\hat{d}$ from the sampling point to the reconstructed mesh; the specular reflection information can be initialized to a random value, and the specular reflection color is output through the specular reflection network. The multi-layer perceptron of the specular reflection network has 4 hidden layers with 256 nodes per layer, and each layer uses a ReLU activation function. The specular reflection color in the illumination generated for the 3D model can be solved by the specular reflection network.
Specifically, in the diffuse reflection network, the input parameters are the learnable diffuse reflection information $\hat{f}_d$ and the distance $\hat{d}$ from the sampling point to the reconstructed mesh; the diffuse reflection information may be initialized to a random value. The albedo is output through the diffuse reflection network, and the albedo is then converted into a diffuse reflection color through the environment map: each pixel of the environment map is treated as a point light source, and the diffuse reflection color is obtained by calculating and summing the light emitted by all the point light sources,
$c_d(x) = a(x)\sum_i L_i \,\max(\omega_i \cdot n,\, 0)$,
where $\cdot$ denotes the dot product, $L_i$ is the i-th incident light, $\omega_i$ is its incident direction, $a(x)$ is the albedo, and $n$ is the normal direction at the sampling point $x$.
The multi-layer perceptron of the diffuse reflection network has 4 hidden layers with 256 nodes per layer, and each layer uses a ReLU activation function. The albedo of the 3D model appearance and the diffuse reflection color of the 3D model illumination can be solved by the diffuse reflection network.
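The conversion of the albedo into a diffuse reflection color by summing over the environment-map "point lights" can be sketched as follows; treating each environment-map pixel as a directional point light and clamping the cosine term at zero are standard Lambertian assumptions spelled out here only for illustration.

```python
import numpy as np

def diffuse_color(albedo, normal, light_dirs, light_colors):
    """c_d = albedo * sum_i L_i * max(omega_i . n, 0),
    where every environment-map pixel i acts as a point light."""
    cos = np.clip(light_dirs @ normal, 0.0, None)          # (num_lights,)
    incoming = (light_colors * cos[:, None]).sum(axis=0)   # summed incident light (RGB)
    return albedo * incoming

# Usage with a toy 4-pixel "environment map":
n = np.array([0.0, 0.0, 1.0])
dirs = np.array([[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 0, -1]], dtype=float)
cols = np.full((4, 3), 0.25)
print(diffuse_color(np.array([0.8, 0.6, 0.4]), n, dirs, cols))
```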
Accordingly, in step 103, color extraction is performed on the scene picture to be modeled by using the pre-trained color generation network according to the SDF value, the normal line, the roughness information, the specular reflection information, and the diffuse reflection information of the second sampling point in the shape dimension, the appearance dimension, and the illumination dimension, to obtain the color information of the second sampling point.
Further, in the above embodiment, as shown in fig. 3, before step 103, the method further includes the following steps:
105. And acquiring influence factors of different environmental factors in the scene to be modeled on the color information, and setting additional parameter information suitable for being added into the color generation network according to these influence factors, so that the color information of the second sampling point output by the color generation network has different color expression effects under the influence of the environmental factors.
The influence factors of different environmental factors on the color information may include, but are not limited to, camera information, additional light source information, and illumination information generated by the additional light source. For the camera information, the additional parameter information may include the camera direction; for the additional light source information, the additional parameter information may include the light source position and light source color; and for the illumination information generated by the additional light source, the additional parameter information may include highlight information and shadow information.
Correspondingly, in step 103, according to the parameter information of the second sampling point in the dimensions of different materials and the additional parameter information, the color of the scene picture to be modeled is extracted by using the pre-trained color generation network, so as to obtain the color information of the second sampling point.
Specifically, in the color generation network, the input parameters are the parameter information of the second sampling point in the different material dimensions together with the additional parameter information: the geometric information, appearance information and illumination information of the 3D model are reflected by the parameter information of the second sampling point in the different material dimensions, and the camera information, the additionally added light source information and the illumination generated by the additional light sources in the scene to be modeled are reflected by the additional parameter information. The color generation network contains a multi-layer perceptron, and the input parameters are again encoded with the trigonometric-function encoding. The multi-layer perceptron consists of 8 hidden layers with 256 neurons per layer, and each layer uses a ReLU activation function.
The network predicts the pixel color $\hat{C}(r)$ of a camera ray $r(t) = o + t\,d$ cast from the camera position $o$ along the direction $d$, where $t$ is the ray travel distance. The inputs for the i-th sampling point (of N sampling points in total) are: the current sampling position $x_i$, the normal derived from the SDF and normalized, i.e. $n = \nabla f(x) / \lVert \nabla f(x) \rVert$, the camera direction $d$, the point light source position, the SDF value $f(x_i)$, the roughness $r_i$, the diffuse reflection color, the specular reflection color, the light source color, the shadow information and the highlight information.
Further, in order to obtain the shadow information, the light source needs to be sampled; the specific sampling process can refer to the 3D space sampling process for the first sampling point and the second sampling point. For a non-transparent object, the intersection point of the camera ray and the object surface can be obtained through the depth, i.e. the object surface is the level set where the SDF equals 0, and the depth is obtained as $D = \sum_{i} w_i\, t_i$, where $w_i$ is an unbiased density weight.
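The accumulation of per-sample weights along a camera ray, used above to obtain the depth (and, analogously, the pixel color in volume rendering), can be sketched as follows; standard alpha-compositing weights are used here, whereas the text above refers to an unbiased density weight, so this is a simplified illustration rather than the exact weighting.

```python
import numpy as np

def ray_weights(sigma, t):
    """Alpha-compositing weights w_i for samples with density sigma_i at distances t_i."""
    delta = np.diff(t, append=t[-1] + 1e10)                          # interval lengths
    alpha = 1.0 - np.exp(-sigma * delta)                             # opacity per interval
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))    # transmittance
    return trans * alpha

def ray_depth_and_color(sigma, t, colors):
    w = ray_weights(sigma, t)
    depth = (w * t).sum()                          # D = sum_i w_i * t_i
    pixel = (w[:, None] * colors).sum(axis=0)      # accumulated pixel color
    return depth, pixel
```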
Further, for the highlight information, the obtained normal information can be input into a microfacet reflection illumination model, and 5 roughness values are adopted, thereby obtaining 5 highlight color values.
In the practical application process, the pre-trained model network and the pre-trained color generation network need to be trained with a large number of sample pictures of the scene to be modeled taken from different viewing angles. During training, the loss between the color value predicted by the network and the ground-truth value is calculated through a loss function, and the network parameters are continuously adjusted according to the calculated loss value so that the predicted color value approaches the ground truth; training ends when the error is smaller than a preset value. The overall loss function can consist of multiple parts, specifically
$\mathcal{L} = \mathcal{L}_{color} + \mathcal{L}_{reg} + \mathcal{L}_{sdf} + \mathcal{L}_{smooth} + \mathcal{L}_{consistency}$,
where $\mathcal{L}_{color}$ is the color reconstruction loss function, $\mathcal{L}_{reg}$ is the regularization loss function, $\mathcal{L}_{sdf}$ is the SDF reconstruction loss function, $\mathcal{L}_{smooth}$ is the smoothness loss function, and $\mathcal{L}_{consistency}$ is the consistency loss function.
The specific color reconstruction loss function is:
$\mathcal{L}_{color} = \sum_{r \in \mathcal{R}} \left( \lVert \hat{C}_c(r) - C(r) \rVert^2 + \lVert \hat{C}_f(r) - C(r) \rVert^2 \right)$,
where $r$ denotes a ray, $\mathcal{R}$ denotes the set of rays in each batch of the training set, $\hat{C}_c(r)$ is the screen pixel color value predicted by coarse-granularity sampling, $\hat{C}_f(r)$ is the screen pixel color value predicted by fine sampling, and $C(r)$ is the ground-truth screen pixel color value. The loss value is calculated from the predicted values and the ground-truth value, and the losses of all screen rays are summed.
Considering that the sampling process is mainly divided into coarse-granularity uniform sampling and fine-granularity importance sampling, the coarse-granularity sampling needs to provide the weights used to allocate the fine samples; since the coarse-granularity sampling needs to output the shadow contribution degree, and an inaccurate shadow contribution degree affects the second-layer sample allocation, the loss function of the coarse-granularity sampling and the loss function of the fine-granularity sampling need to be calculated separately.
The specific regularization loss function is: $\mathcal{L}_{reg} = \dfrac{1}{N}\sum_{i}\left(\lVert \nabla f(x_i) \rVert - 1\right)^2$, where $\nabla f(x_i)$ is the gradient of the SDF at the point $x_i$.
The specific SDF reconstruction loss function is: $\mathcal{L}_{sdf} = \sum_i \lVert \hat{f}(x_i) - f_i \rVert^2$, where $f_i$ is the ground-truth SDF value of the sampling point and $\hat{f}(x_i)$ is the value predicted by the network.
The specific smoothness loss function is: $\mathcal{L}_{smooth} = \sum_i \sum_{j \in \mathcal{N}(i)} \lVert g_i - g_j \rVert^2$, where $\mathcal{N}(i)$ denotes the indices of the vertices adjacent to the i-th vertex and $g_i$ is the geometric feature of the i-th vertex; the smoothness loss is used to penalize differences in the geometric features of adjacent vertices.
The specific consistency loss function is computed over the pixels of the environment map P: for each pixel j, the unit vector from the camera origin to the j-th pixel position of the environment map (sky sphere or spherical sky box) is formed and, together with the roughness r, used to compare the specular reflection generated by the network against the environment map, which ensures that the environment map and the specular reflection generated by the network remain consistent.
Further, in order to facilitate flexible modification of attribute information in a scene to be modeled, in the above embodiment, as shown in fig. 4, after step 104, the method further includes the following steps:
106. and responding to an editing instruction of the scene to be modeled, and acquiring network input parameters with editability of the scene picture to be modeled in the material information extraction process.
107. And adjusting the network input parameters with the editability according to the editing instruction so as to extract information and color of the scene picture to be modeled according to the adjusted network input parameters and obtain the updated three-dimensional scene model data of the scene picture to be modeled.
In this embodiment, the editing instruction for the scene to be modeled can edit the shape, appearance and illumination information of the scene to be modeled, thereby changing the modeling and/or rendering effect of the 3D model. Shape editing can use rigid deformation to reconstruct the 3D model mesh. Appearance and illumination editing can edit appearance information such as diffuse reflection, specular reflection and roughness; considering that the appearance-related material information is used as input parameters of the color generation network, it is editable. Only one feature is changed at a time, for example the diffuse reflection feature is changed while the specular reflection feature and the roughness feature remain unchanged, which can be realized by optimizing a loss function of the form
$\mathcal{L}_{edit} = \sum_{r \in \mathcal{R}_p} \lVert C(r) - \hat{C}_{comp}(r) \rVert^2$,
where $\mathcal{R}_p$ denotes all camera rays of the rendered pixels, $C(r)$ represents the color of the pixel being rendered, and $\hat{C}_{comp}(r)$ represents the pixel color of a rendering component after volume rendering, a rendering component being for example the diffuse reflection color.
Further, considering the influence of other environmental factors in the scene to be modeled on the attribute information in the scene to be modeled, on the basis of editing the scene to be modeled by using the attribute information in the scene to be modeled, the additional parameter information suitable for being added into the color generation network can be edited.
Specifically, after the editable network input parameters are adjusted according to the editing instruction, the additional parameter information suitable for being added into the color generation network in the scene to be modeled is obtained; the additional parameter information is adjusted according to the updated map information and/or the newly added illumination information in the scene to be modeled; information extraction is performed on the scene picture to be modeled according to the adjusted network input parameters, and color extraction is performed on the scene picture to be modeled in combination with the adjusted additional attribute parameters, obtaining the updated three-dimensional scene model data of the scene picture to be modeled.
In this embodiment, the additional parameter information includes re-illumination obtained by a mapping method and re-illumination obtained by an additional light source method, and for the re-illumination obtained by the mapping method, updated mapping information in the scene to be modeled can be determined by obtaining diffuse reflection illumination and specular reflection illumination in the scene to be modeled and changing an environmental mapping in the scene to be modeled to a target environmental mapping; for the re-illumination obtained by the additional light source mode, the newly added illumination information in the scene to be modeled can be determined by acquiring the light source information in the scene to be modeled and modifying the light source information.
Specifically, re-illumination through the mapping approach can be divided into diffuse reflection illumination, represented by an explicit environment map, and specular reflection illumination, represented by a multi-layer perceptron network. For diffuse reflection illumination, the environment map is used to represent the diffuse reflection illumination in the scene to be modeled, and it can easily be changed by replacing the environment map of the scene to be modeled with a target environment map. For specular reflection illumination, the specular reflection illumination in the scene to be modeled cannot be modified directly; a multi-layer perceptron network is used to represent it, and it is modified indirectly by replacing the environment map of the scene to be modeled with the target environment map, with the updated map information in the scene determined from the modified diffuse and specular reflection illumination. That is, a loss function is optimized with respect to the target environment map so that the specular illumination network adapts to the target environment map. In this loss function, S denotes the number of samples on the mesh surface, P is the number of pixels of the target environment map, $r_i$ denotes the roughness value of the i-th sampling point, and $v_j$ is the unit vector from the camera origin to the j-th pixel position of the target environment map (sky sphere or spherical sky box). It should be noted that here, too, the normal direction n and the camera direction are assumed to be identical.
Note that the above manner of editing the specular reflection illumination only applies when the roughness of the sampling point is small; if the roughness is large, the re-lit result still shows a mirror-like specular effect, which is not the correct illumination effect. To handle specular reflection at larger roughness, roughness information of the second sampling point in the scene to be modeled can be obtained before the specular reflection illumination in the scene to be modeled is indirectly changed. If the roughness information of the second sampling point is smaller than a preset threshold, the environment map in the scene to be modeled is replaced with the target environment map, the network of the multi-layer perceptron is optimized according to the target environment map, and the specular reflection illumination in the scene to be modeled is indirectly changed during this optimization. If the roughness information of the second sampling point is greater than or equal to the preset threshold, the environment map in the scene to be modeled is replaced with the target environment map, a multi-layer texture map of the target environment map is generated according to Monte Carlo sampling, and the specular reflection illumination in the scene to be modeled is indirectly changed according to this multi-layer texture map.
Specifically, pre-filtered environment maps at different roughness levels are obtained by Monte Carlo sampling, i.e. a multi-layer texture map of the target environment map is generated, with the formula

$$M(n, r) = \frac{\sum_{j=1}^{J} L(\omega_j)\,(n \cdot \omega_j)}{\sum_{j=1}^{J} (n \cdot \omega_j)},$$

where $L(\omega_j)$ denotes the light coming from the direction $\omega_j$, $\omega_j$ is the direction of the incident light, $n$ is the normal of the surface at the incident point, and $J$ is the number of samples of the incident-light direction.
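For illustration only, a minimal sketch of this Monte Carlo pre-filtering step is given below (NumPy is assumed; env_sample is a hypothetical callable returning the radiance of the environment map in a given direction, and the sampling lobe is a crude stand-in for proper BRDF importance sampling):

```python
import numpy as np

# Illustrative sketch only: env_sample, the lobe shape and the roughness levels
# are assumptions for this example, not the embodiment's implementation.

def sample_dirs_around(n, roughness, count, seed=0):
    """Draw `count` incident directions concentrated around the normal n;
    a larger roughness spreads the lobe wider."""
    rng = np.random.default_rng(seed)
    dirs = rng.normal(size=(count, 3))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    dirs = (1.0 - roughness) * n + roughness * dirs
    return dirs / np.linalg.norm(dirs, axis=1, keepdims=True)

def prefilter(env_sample, n, roughness, num_samples=256):
    """Monte Carlo estimate of M(n, r) = sum_j L(w_j)(n.w_j) / sum_j (n.w_j)."""
    dirs = sample_dirs_around(n, roughness, num_samples)
    weights = np.clip(dirs @ n, 0.0, None)                 # (n . w_j), clamped to >= 0
    colors = np.stack([env_sample(d) for d in dirs])       # L(w_j) for each sample
    return (colors * weights[:, None]).sum(axis=0) / max(weights.sum(), 1e-8)

def build_multilayer_texture(env_sample, normals, levels=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """One pre-filtered layer per roughness level, evaluated at the given directions."""
    return {r: np.stack([prefilter(env_sample, n, r) for n in normals]) for r in levels}
```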
It will be appreciated that integrating the incident light at different roughness levels, with a correspondingly different number of samples, produces a multi-layer texture map of the environment map in which each texture level has a fixed roughness value. Based on the roughness $r$ and camera direction $\omega$ of each sampling point, the specular illumination of the corresponding level is looked up from the multi-layer texture map, at which point the re-illumination loss function may become

$$L_{relight} = \frac{1}{S}\sum_{i=1}^{S}\left\| c_i - M(\omega_i, r_i) \right\|^2,$$

where $M$ is the pre-computed environmental multi-layer texture map and $c_i$ is the color of the light at the i-th sampling point seen from the camera direction $\omega_i$ with roughness $r_i$.
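Continuing the illustrative sketch above (again with hypothetical names; the nearest-level lookup is an assumption, the description only requires selecting the texture level corresponding to the sampling point's roughness), the roughness-based lookup and the re-illumination loss could be written as:

```python
import numpy as np

# Illustrative continuation: the level whose fixed roughness is closest to the
# sampling point's roughness is looked up, and the predicted specular color c_i
# is compared against that lookup with a mean-squared re-illumination loss.

def lookup_specular(multilayer, dir_index, roughness):
    """multilayer: {roughness_level: (D, 3) array}, indexed by pre-computed direction."""
    levels = np.array(sorted(multilayer.keys()))
    nearest = levels[np.argmin(np.abs(levels - roughness))]
    return multilayer[nearest][dir_index]

def relight_loss(pred_colors, multilayer, dir_indices, roughness_values):
    """(1/S) * sum_i ||c_i - M(w_i, r_i)||^2 over the S surface samples."""
    targets = np.stack([lookup_specular(multilayer, j, r)
                        for j, r in zip(dir_indices, roughness_values)])
    return float(((pred_colors - targets) ** 2).mean())
```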
Specifically, for the re-illumination obtained by the additional light source approach, a point light source can be added, and the rendering effect in the scene to be modeled is changed by modifying the position and color of the light source point. If several point light sources need to be added, additional light source positions and colors are added to the color generation network; with several light sources and colors, correspondingly several additional learnable input parameters are introduced.
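A minimal sketch of how such editable point lights could be exposed as learnable input parameters is shown below (PyTorch is assumed; the class name EditableLights and the parameterization are illustrative only, not the embodiment's code):

```python
import torch
import torch.nn as nn

# Illustrative sketch only: the light positions and colors are learnable parameters
# that get appended to the color generation network input, so editing (or learning)
# them changes the rendered lighting.

class EditableLights(nn.Module):
    def __init__(self, num_lights=2):
        super().__init__()
        self.positions = nn.Parameter(torch.randn(num_lights, 3))  # learnable / editable
        self.colors = nn.Parameter(torch.ones(num_lights, 3))      # learnable / editable

    def extra_inputs(self, batch_size):
        """Flatten the light parameters and repeat them for every sampling point."""
        flat = torch.cat([self.positions, self.colors], dim=-1).flatten()
        return flat.expand(batch_size, flat.shape[0])
```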
In an actual application scene, a modeling framework structure of the three-dimensional scene is shown in fig. 5, in the modeling process of the three-dimensional scene in fig. 5, according to an input scene picture to be modeled, an SDF representation, a volume density and color information of a 3D model can be output by utilizing a pre-trained network model, and the SDF representation, the volume density and the color information of the 3D model can be correspondingly adjusted by adjusting network input parameters, so that the shape, the appearance and the illumination information of the 3D model can be adjusted.
The pre-trained network model here includes a shape network, a texture network and a color generation network. In the shape network, as shown in fig. 6, the input parameter is the attribute information of the first sampling point and the output is a model represented by an SDF; the SDF-represented model is further rendered by the Marching Cubes algorithm to generate a 3D model grid. In the texture network, as shown in fig. 7A, the texture network includes a geometry material sub-network, a roughness material sub-network, a specular reflection material sub-network and a diffuse reflection material sub-network; its input parameter is the attribute information of the second sampling point, and its outputs are the volume density of the second sampling point and the parameter information of the second sampling point in the different material dimensions. The individual material sub-networks are shown in figs. 7B-7E. In the geometry material sub-network, as shown in fig. 7B, the input parameter is the attribute information of the second sampling point, and the SDF value, normal and volume density of the second sampling point are output as its geometric information. In the roughness material sub-network, as shown in fig. 7C, the input parameter is the attribute information of the second sampling point, and the roughness information of the second sampling point is output as its roughness. In the specular reflection sub-network, as shown in fig. 7D, the input parameter is the attribute information of the second sampling point, and the specular reflection tone of the second sampling point is output as its specular reflection. In the diffuse reflection sub-network, as shown in fig. 7E, the input parameter is the attribute information of the second sampling point, the albedo of the second sampling point is output as its diffuse reflection, and the albedo is then converted into a diffuse reflection color through the environment map. In the color generation network, as shown in fig. 8, the input parameters are the parameter information of the second sampling point in the different material dimensions together with the additional parameter information, specifically including the normal, SDF value, roughness, highlight information, shadow information, camera direction, sampling position, light source color, specular reflection color and diffuse reflection color, and the output parameter is the color information of the second sampling point.
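Purely as an illustration of how the inputs of fig. 8 might be assembled (the layer sizes, the field ordering and the name ColorGenerationNet are assumptions for this sketch, not the actual network of the embodiment):

```python
import torch
import torch.nn as nn

# Illustrative sketch only: the input layout follows the list of fig. 8 (normal, SDF
# value, roughness, highlight, shadow, camera direction, sampling position, light
# source color, specular color, diffuse color); sizes are assumed for the example.

class ColorGenerationNet(nn.Module):
    def __init__(self, extra_dim=0, hidden=128):
        super().__init__()
        # 3 (normal) + 1 (SDF) + 1 (roughness) + 3 (highlight) + 1 (shadow)
        # + 3 (camera dir) + 3 (position) + 3 (light color) + 3 (specular) + 3 (diffuse) = 24
        in_dim = 24 + extra_dim
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),   # RGB of the second sampling point
        )

    def forward(self, material_features, extra=None):
        x = material_features if extra is None else torch.cat([material_features, extra], dim=-1)
        return self.net(x)
```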
Further, as an application of the three-dimensional scene model data, the application also provides a rendering method of a three-dimensional scene, as shown in fig. 9, the method can be applied to a server corresponding to three-dimensional rendering, and the method comprises the following steps:
301. extracting the volume density of the second sampling point and the color information of the second sampling point from the three-dimensional scene model data;
302. and according to the volume density of the second sampling point and the color information of the second sampling point, performing screen pixel volume rendering and light source volume rendering on the scene picture to be modeled.
The three-dimensional scene model data are obtained by using the modeling method of the three-dimensional scene described above; the screen pixel volume rendering is used for performing color rendering on the 3D model, and the light source volume rendering is used for performing shadow rendering on the 3D model.
Specifically, for screen pixel volume rendering, the camera emits a ray per pixel, $r(t) = o + t\,d$, where $o$ is the camera center, $d$ is the viewing direction and $t$ is the travel distance. The color of a particle on the camera ray, multiplied by its volume density and its cumulative transmittance, is the contribution of that particle to the color of the screen pixel. The sum of the contributions of all particles on this ray, i.e. the color of the camera screen pixel, is

$$C = \sum_{i=1}^{N} T_i\left(1 - e^{-\sigma_i \delta_i}\right) c_i, \qquad T_i = \exp\!\left(-\sum_{k=1}^{i-1} \sigma_k \delta_k\right),$$

where $N$ is the total number of sampling points, $\left(1 - e^{-\sigma_i \delta_i}\right)$ is the contribution coefficient of the current color, $\sigma_i$ is its volume density, $\delta_i$ is the distance between two adjacent sampling points, $T_i$ is the cumulative transmittance, and $c_i$ is the color of the i-th particle, which can be predicted by the color generation network. Specifically, in the training process of the color generation network, each pixel of the scene picture to be modeled can be used as a ground-truth value to supervise the color information of the second sampling point predicted by the color generation network.
Specifically, for light source volume rendering, the light source emits a ray $r(t) = o_L + t\,d_L$, where $o_L$ is the light source position, $d_L$ is the light direction and $t$ is the travel distance. The color of a particle on the light ray, multiplied by its volume density and its cumulative transmittance, is the contribution of that particle to the color of the light source. The sum of the contributions of all particles on the light ray is the color of the light source:

$$C_L = \sum_{i=1}^{N} T_i\left(1 - e^{-\sigma_i \delta_i}\right) c_i, \qquad T_i = \exp\!\left(-\sum_{k=1}^{i-1} \sigma_k \delta_k\right),$$

where $N$ is the total number of sampling points, $\left(1 - e^{-\sigma_i \delta_i}\right)$ is the contribution coefficient of the current color, $\sigma_i$ is its volume density, $\delta_i$ is the distance between two adjacent sampling points, $T_i$ is the cumulative transmittance, and $c_i$ is the color of the i-th particle, which can be predicted by the color generation network. In the training process of the color generation network, the light source color can be white or another edited color, and the light source color can be used as a ground-truth value to supervise the color information of the second sampling point predicted by the color generation network.
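The accumulation shared by screen pixel volume rendering and light source volume rendering can be sketched as follows (PyTorch is assumed; tensor shapes and the function name volume_render are illustrative, not the embodiment's code):

```python
import torch

# Illustrative sketch: each sample contributes its color weighted by
# alpha_i = 1 - exp(-sigma_i * delta_i) and by the transmittance accumulated in
# front of it. The same routine serves camera rays (screen pixel color) and
# light source rays (shadow rendering).

def volume_render(colors, sigmas, deltas):
    """colors: (N, 3); sigmas: (N,) volume densities; deltas: (N,) distances
    between adjacent sampling points. Returns the accumulated RGB along the ray."""
    alphas = 1.0 - torch.exp(-sigmas * deltas)                          # contribution coefficients
    trans = torch.cumprod(
        torch.cat([torch.ones(1), 1.0 - alphas + 1e-10])[:-1], dim=0)  # cumulative transmittance T_i
    weights = alphas * trans
    return (weights[:, None] * colors).sum(dim=0)
```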
Further, as a specific implementation of the method of fig. 1-4, an embodiment of the present application provides a modeling apparatus for a three-dimensional scene, as shown in fig. 10, where the apparatus includes: a first acquisition unit 41, a first extraction unit 42, a second extraction unit 43, a determination unit 44.
A first obtaining unit 41, configured to obtain a first sampling point obtained by 3D spatial sampling of a scene picture to be modeled;
the first extraction unit 42 is configured to extract information of a scene picture to be modeled by using a pre-trained model network according to attribute information of the first sampling point, so as to obtain a volume density of a second sampling point and parameter information of the second sampling point on different material dimensions, where the second sampling point is obtained by sampling a ray of a preset view angle in the scene to be modeled in a 3D space;
the second extracting unit 43 is configured to perform color extraction on the scene picture to be modeled by using a pre-trained color generating network according to the parameter information of the second sampling point in different material dimensions, so as to obtain color information of the second sampling point;
the determining unit 44 is configured to determine three-dimensional scene model data of the scene picture to be modeled according to the volume density of the second sampling point, parameter information of the second sampling point in different material dimensions, and color information of the second sampling point.
Compared with the existing approach of modeling a three-dimensional scene through computer graphics, the modeling device for a three-dimensional scene provided by the embodiment of the invention obtains a first sampling point obtained by 3D space sampling of the scene picture to be modeled. According to the attribute information of the first sampling point, information extraction is first carried out on the scene picture to be modeled by using a pre-trained model network to obtain the volume density of the second sampling point and the parameter information of the second sampling point in different material dimensions, where the second sampling point is obtained by 3D space sampling of rays of a preset view angle in the scene to be modeled. Color extraction is then carried out on the scene picture to be modeled by using a pre-trained color generation network to obtain the color information of the second sampling point, and the three-dimensional scene model data of the scene picture to be modeled are determined according to the volume density of the second sampling point, the parameter information of the second sampling point in different material dimensions and the color information of the second sampling point. In other words, attribute information in different material dimensions of the scene picture to be modeled is first estimated through the model network, the color information of the scene picture to be modeled is then estimated through the color generation network, and the scene information to be modeled is obtained by combining the attribute information and the color information in the different material dimensions.
In a specific application scenario, the pre-trained model network includes a shape network and a texture network, and the first extraction unit includes:
the first extraction module is used for extracting shape information of the scene picture to be modeled by utilizing the shape network according to the attribute information of the first sampling point to obtain a model expressed by the SDF;
the estimating module is used for generating a 3D model grid from the model represented by the SDF, and estimating attribute information of a second sampling point according to vertex data of the 3D model grid;
and the second extraction module is used for extracting the material information of the scene picture to be modeled by utilizing the material network according to the attribute information of the second sampling point to obtain the volume density of the second sampling point and the parameter information of the second sampling point on different material dimensions.
In a specific application scene, the estimation module is specifically configured to render the model represented by the SDF through the Marching Cubes algorithm to obtain a 3D model grid; acquire a second sampling point obtained by 3D space sampling of the 3D model grid; and estimate attribute information of the second sampling point according to the vertex data of the 3D model grid.
In a specific application scene, the estimation module is specifically further configured to select, according to the 3D model grid, a ray in the scene to be modeled passing through the 3D model grid as a preset view angle; and 3D space sampling is carried out on the rays of the preset visual angle, and a second sampling point is obtained.
In a specific application scenario, the estimation module is specifically further configured to obtain a plurality of target grid vertices adjacent to the second sampling point in the 3D model grid, and determine attribute information of the plurality of target grid vertices according to vertex data of the 3D model grid; and carrying out weighted summation on the attribute information of the plurality of target grid vertexes through a preset estimation algorithm to obtain the attribute information of the second sampling point.
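A hedged sketch of this estimation step is given below (NumPy is assumed; the inverse-distance weighting is only one possible choice of the "preset estimation algorithm" mentioned above, not a requirement of the embodiment):

```python
import numpy as np

# Illustrative sketch: the mesh vertices nearest to the sampling point are found and
# their attributes averaged with inverse-distance weights (a weighted summation over
# the adjacent target grid vertices).

def estimate_point_attributes(point, vertices, vertex_attrs, k=4):
    """point: (3,); vertices: (V, 3); vertex_attrs: (V, D). Returns (D,) attributes."""
    dists = np.linalg.norm(vertices - point, axis=1)
    nearest = np.argsort(dists)[:k]                      # k adjacent target grid vertices
    weights = 1.0 / (dists[nearest] + 1e-8)
    weights /= weights.sum()                             # normalized weights
    return (weights[:, None] * vertex_attrs[nearest]).sum(axis=0)
```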
In a specific application scene, the material network comprises a plurality of material sub-networks for extracting different material information, and the second extraction module is specifically configured to extract geometric information of the scene picture to be modeled by using the first material sub-network according to the attribute information of the second sampling point to obtain an SDF value, a normal line and a volume density of the second sampling point in the shape dimension; extract roughness information of the scene picture to be modeled by using a second material sub-network according to the attribute information of the second sampling point to obtain the roughness information of the second sampling point in the appearance dimension; extract specular reflection information of the scene picture to be modeled by using a third material sub-network according to the attribute information of the second sampling point to obtain the specular reflection information of the second sampling point in the illumination dimension; and extract diffuse reflection information of the scene picture to be modeled by using a fourth material sub-network according to the attribute information of the second sampling point to obtain the diffuse reflection information of the second sampling point in the appearance dimension and the illumination dimension;
Correspondingly, the second extraction unit is specifically configured to perform color extraction on the scene picture to be modeled by using a pre-trained color generation network according to the SDF value of the second sampling point in the shape dimension, the normal line, the roughness information of the second sampling point in the appearance dimension, the specular reflection information of the second sampling point in the illumination dimension, and the diffuse reflection information of the second sampling point in the appearance dimension and the illumination dimension, so as to obtain the color information of the second sampling point.
In a specific application scenario, the apparatus further includes:
the setting unit is used for obtaining influence factors of different environmental factors in the scene to be modeled on the color information before the color information of the second sampling point is obtained by utilizing a pre-trained color generation network to perform color extraction on the scene to be modeled according to the parameter information of the second sampling point on different material dimensions, and setting additional parameter information suitable for being added into the color generation network according to the influence factors so that the color information of the second sampling point output by the color generation network under the influence of the environmental factors has different color expression effects;
Correspondingly, the second extraction unit is specifically further configured to perform color extraction on the scene picture to be modeled by using a pre-trained color generation network according to the parameter information of the second sampling point in different material dimensions and the additional parameter information, so as to obtain color information of the second sampling point.
In a specific application scenario, the apparatus further includes:
the second obtaining unit is used for obtaining network input parameters with editability in the material information extraction process of the scene picture to be modeled in response to an editing instruction of the scene to be modeled after determining the three-dimensional scene model data of the scene picture to be modeled according to the volume density of the second sampling point, the parameter information of the second sampling point on different material dimensions and the color information of the second sampling point;
the adjusting unit is used for adjusting the network input parameters with editability according to the editing instruction so as to extract information and color of the scene picture to be modeled according to the adjusted network input parameters and obtain the updated three-dimensional scene model data of the scene picture to be modeled.
In a specific application scenario, the apparatus further includes:
The third obtaining unit is used for obtaining additional parameter information suitable for being added into a color generation network in a scene to be modeled after the network input parameters with editability are adjusted according to the editing instruction;
the adjusting unit is further configured to adjust the additional parameter information according to updated mapping information and/or newly added illumination information in the scene to be modeled, so as to extract information of the scene picture to be modeled according to the adjusted network input parameters and then extract color of the scene picture to be modeled in combination with the adjusted additional parameter information, thereby obtaining updated three-dimensional scene model data of the scene picture to be modeled.
In a specific application scenario, the apparatus further includes:
the determining unit is used for obtaining diffuse reflection illumination and specular reflection illumination in the scene to be modeled after obtaining the additional parameter information which is suitable for being added into the color generation network in the scene to be modeled, and determining updated mapping information in the scene to be modeled by replacing the environment mapping in the scene to be modeled with the target environment mapping; acquiring light source information in a scene to be modeled, and determining newly added illumination information in the scene to be modeled by modifying the light source information.
In a specific application scene, the determining unit is specifically configured to use the environment map to represent diffuse reflection illumination in the scene to be modeled, and directly change the diffuse reflection illumination in the scene to be modeled by changing the environment map in the scene to be modeled to a target environment map; use a network of multi-layer perceptrons to represent specular reflection illumination in the scene to be modeled, and indirectly modify the specular reflection illumination in the scene to be modeled by replacing the environment map in the scene to be modeled with the target environment map; and determine updated mapping information in the scene to be modeled according to the diffuse reflection illumination and the specular reflection illumination in the scene to be modeled after modification.
In a specific application scenario, the determining unit is specifically further configured to obtain roughness information of the second sampling point in the scene to be modeled before the network of the multi-layer perceptron representing specular reflection illumination in the scene to be modeled is used and the environment map in the scene to be modeled is replaced by the target environment map to indirectly change the specular reflection illumination in the scene to be modeled; if the roughness information of the second sampling point is smaller than a preset threshold value, optimize the network of the multi-layer perceptron according to the target environment map by replacing the environment map in the scene to be modeled with the target environment map, and indirectly change the specular reflection illumination in the scene to be modeled in the optimization process; and if the roughness information of the second sampling point is greater than or equal to the preset threshold value, replace the environment map in the scene to be modeled with the target environment map, generate a multi-layer texture map of the target environment map according to Monte Carlo sampling, and indirectly change the specular reflection illumination in the scene to be modeled according to the multi-layer texture map of the target environment map.
In a specific application scenario, the apparatus further includes:
the first modeling unit is used for acquiring an SDF value of the second sampling point in the shape dimension from the parameter information of the second sampling point in the different material dimension after determining the three-dimensional scene model data of the scene picture to be modeled according to the volume density of the second sampling point, the parameter information of the second sampling point in the different material dimension and the color information of the second sampling point, and performing 3D modeling on the scene picture to be modeled according to the SDF value of the second sampling point in the shape dimension;
and the second modeling unit is used for carrying out color 3D modeling on the scene picture to be modeled according to the volume density of the second sampling point, the SDF value of the second sampling point in the shape dimension and the color information of the second sampling point.
It should be noted that, other corresponding descriptions of each functional unit related to the modeling apparatus for three-dimensional scene provided in this embodiment may refer to corresponding descriptions in fig. 1 to fig. 4, and are not described herein again.
Further, as a specific implementation of the method of fig. 9, an embodiment of the present application provides a rendering device of a three-dimensional scene, as shown in fig. 11, where the device includes: a fourth acquisition unit 51, a rendering unit 52.
A fourth obtaining unit 51, configured to extract the volume density of the second sampling point and the color information of the second sampling point from the three-dimensional scene model data;
a rendering unit 52, configured to perform screen pixel volume rendering and light source volume rendering on a scene picture to be modeled according to the volume density of the second sampling point and the color information of the second sampling point, where the screen pixel volume rendering is used for performing color rendering on the 3D model, and the light source volume rendering is used for performing shadow rendering on the 3D model;
the three-dimensional scene model data are obtained by using the modeling method of the three-dimensional scene.
It should be noted that, for other corresponding descriptions of each functional unit related to the rendering device for three-dimensional scene provided in this embodiment, reference may be made to the corresponding description in fig. 9, and no further description is given here.
Based on the above method shown in fig. 1-4, correspondingly, the embodiment of the application further provides a storage medium, on which a computer program is stored, which when executed by a processor, implements the above method for modeling a three-dimensional scene shown in fig. 1-4.
Based on such understanding, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.), and includes several instructions for causing a computer device (may be a personal computer, a server, or a network device, etc.) to perform the methods described in various implementation scenarios of the present application.
Based on the method shown in fig. 1 to fig. 4 and the virtual device embodiment shown in fig. 10, in order to achieve the above objective, the embodiment of the present application further provides a three-dimensional scene modeling entity device, which may specifically be a computer, a smart phone, a tablet computer, a smart watch, a server, or a network device, where the entity device includes a storage medium and a processor; a storage medium storing a computer program; a processor for executing a computer program to implement the above-described modeling method of a three-dimensional scene as shown in fig. 1-4.
Based on the method shown in fig. 9 and the virtual device embodiment shown in fig. 11, in order to achieve the above objective, the embodiment of the present application further provides an entity device for rendering a three-dimensional scene, which may specifically be a computer, a smart phone, a tablet computer, a smart watch, a server, or a network device, where the entity device includes a storage medium and a processor; a storage medium storing a computer program; a processor for executing a computer program to implement the modeling method of a three-dimensional scene as described above and shown in fig. 9.
Optionally, the two entity devices may further include a user interface, a network interface, a camera, a Radio Frequency (RF) circuit, a sensor, an audio circuit, a WI-FI module, and so on. The user interface may include a Display screen (Display), an input unit such as a Keyboard (Keyboard), etc., and the optional user interface may also include a USB interface, a card reader interface, etc. The network interface may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface), etc.
In an exemplary embodiment, referring to fig. 12, the entity device includes a communication bus, a processor, a memory, a communication interface, an input/output interface, and a display device, where each functional unit may perform communication with each other through the bus. The memory stores a computer program, and a processor for executing the program stored on the memory, and executing the modeling method of the three-dimensional scene in the above embodiment.
It will be appreciated by those skilled in the art that the modeled solid device structure of a three-dimensional scene provided in this embodiment is not limited to this solid device, and may include more or fewer components, or may combine certain components, or may be a different arrangement of components.
The storage medium may also include an operating system, a network communication module. The operating system is a program that manages the physical device hardware and software resources of the modeling of the three-dimensional scene described above, supporting the execution of information handling programs and other software and/or programs. The network communication module is used for realizing communication among all components in the storage medium and communication with other hardware and software in the information processing entity equipment.
From the above description of the embodiments, it will be apparent to those skilled in the art that the present application may be implemented by means of software plus necessary general hardware platforms, or may be implemented by hardware. By applying the technical scheme, compared with the existing mode, the method and the device have the advantages that firstly, estimation of attribute information on different material dimensions is carried out on the scene picture to be modeled through a model network, then, estimation of color information is carried out on the scene picture to be modeled through a color generation network, the scene information to be modeled is obtained by combining the attribute information and the color information on the different material dimensions, and due to the fact that the scene information is fused with multidimensional attributes in the scene to be modeled, accurate modeling of the three-dimensional scene can be achieved without a large amount of calculation, and therefore, the shape, appearance and illumination of the 3D model are restored with high quality.
Those skilled in the art will appreciate that the drawings are merely schematic illustrations of one preferred implementation scenario, and that the modules or flows in the drawings are not necessarily required to practice the present application. Those skilled in the art will appreciate that modules in an apparatus in an implementation scenario may be distributed in an apparatus in an implementation scenario according to an implementation scenario description, or that corresponding changes may be located in one or more apparatuses different from the implementation scenario. The modules of the implementation scenario may be combined into one module, or may be further split into a plurality of sub-modules.
The foregoing application serial numbers are merely for description, and do not represent advantages or disadvantages of the implementation scenario. The foregoing disclosure is merely a few specific implementations of the present application, but the present application is not limited thereto and any variations that can be considered by a person skilled in the art shall fall within the protection scope of the present application.

Claims (13)

1. A method of modeling a three-dimensional scene, comprising:
acquiring a first sampling point obtained by 3D space sampling of a scene picture to be modeled;
according to the attribute information of the first sampling point, information extraction is carried out on a scene picture to be modeled by utilizing a pre-trained model network, so that the volume density of a second sampling point and the parameter information of the second sampling point on different material dimensions are obtained, the second sampling point is obtained by 3D space sampling of rays of a preset visual angle in the scene to be modeled, the pre-trained model network comprises a shape network and a material network, and specifically, according to the attribute information of the first sampling point, the shape network is utilized to extract the shape information of the scene picture to be modeled, so that a model expressed by SDF is obtained; generating a 3D model grid from the model represented by the SDF, and estimating attribute information of a second sampling point according to vertex data of the 3D model grid; extracting material information of a scene picture to be modeled by utilizing the material network according to the attribute information of the second sampling point to obtain the volume density of the second sampling point and the parameter information of the second sampling point on different material dimensions, wherein the volume density of the second sampling point is obtained by extracting geometric information by utilizing a first material sub-network in the material network;
According to the parameter information of the second sampling points in different material dimensions, performing color extraction on the scene picture to be modeled by utilizing a pre-trained color generation network to obtain the color information of the second sampling points, wherein the parameter information of the second sampling points in different material dimensions is used as input parameters corresponding to the pre-trained color generation network;
determining three-dimensional scene model data of a scene picture to be modeled according to the volume density of the second sampling point, the parameter information of the second sampling point on different material dimensions and the color information of the second sampling point;
using the environment map to represent diffuse reflection illumination in the scene to be modeled, and directly changing the diffuse reflection illumination in the scene to be modeled by changing the environment map in the scene to be modeled into a target environment map; using a network of multi-layer perceptrons to represent specular reflection illumination in the scene to be modeled, and indirectly modifying the specular reflection illumination in the scene to be modeled by replacing the environment map in the scene to be modeled with the target environment map; determining updated mapping information in the scene to be modeled according to the diffuse reflection illumination and the specular reflection illumination in the scene to be modeled after modification; optimizing the loss function according to the target environment map to enable the specular illumination network $F_s$ to adapt to the target environment map $E'$, the loss function formula being:

$$L_{spec} = \frac{1}{S\,P}\sum_{i=1}^{S}\sum_{j=1}^{P}\left\| F_s(r_i, \omega_j) - E'(\omega_j) \right\|^2,$$

where $S$ represents the number of samples of the mesh surface, $P$ represents the number of pixels in the target environment map $E'$, $F_s$ represents the specular illumination network, $r_i$ represents the roughness value of the i-th sampling point, $\omega_j$ is the unit vector from the camera origin to the j-th pixel position of the target environment map $E'$, and here the normal direction $n$ and the camera direction $\omega_j$ are assumed to be identical;
acquiring roughness information of a second sampling point in a scene to be modeled; if the roughness information of the second sampling point is smaller than a preset threshold, the indirectly changing the specular reflection illumination in the scene to be modeled by changing the environment map in the scene to be modeled to the target environment map includes: the method comprises the steps of optimizing a network of a multi-layer perceptron according to a target environment map by replacing the environment map in a scene to be modeled with the target environment map, and indirectly changing specular reflection illumination in the scene to be modeled in the optimization process; if the roughness information of the second sampling point is greater than or equal to a preset threshold, the indirectly changing the specular reflection illumination in the scene to be modeled by changing the environmental map in the scene to be modeled to a target environmental map includes: changing an environment map in a scene to be modeled into a target environment map, generating multi-layer texture mapping of the target environment map according to Monte Carlo sampling, and indirectly changing specular reflection illumination in the scene to be modeled according to the multi-layer texture mapping of the target environment map;
specifically, pre-filtered environment maps at different roughness levels are obtained through Monte Carlo sampling to generate the multi-layer texture map of the target environment map, with the formula:

$$M(n, r) = \frac{\sum_{j=1}^{J} L(\omega_j)\,(n \cdot \omega_j)}{\sum_{j=1}^{J} (n \cdot \omega_j)},$$

where $L(\omega_j)$ represents the light coming from the direction $\omega_j$, $\omega_j$ indicates the direction of the incident light, $n$ indicates the normal of the surface where the incident point is located, and $J$ indicates the number of samples of the incident-light direction;

the multi-layer texture map has a fixed roughness value at each texture level; according to the roughness $r$ and camera direction $\omega$ of each sampling point, the specular illumination of the corresponding level is looked up from the multi-layer texture map, at which point the re-illumination loss function becomes:

$$L_{relight} = \frac{1}{S}\sum_{i=1}^{S}\left\| c_i - M(\omega_i, r_i) \right\|^2,$$

where $M$ is a pre-computed environmental multi-layer texture map and $c_i$ is the color of the light at the sampling point seen from the camera direction $\omega_i$ with roughness $r_i$.
2. The method according to claim 1, wherein the generating the model represented by the SDF into a 3D model mesh, estimating attribute information of a second sampling point according to vertex data of the 3D model mesh, specifically includes:
rendering the model represented by the SDF through the Marching Cubes algorithm to obtain a 3D model grid;
acquiring a second sampling point obtained by 3D space sampling of the 3D model grid;
estimating attribute information of a second sampling point according to vertex data of the 3D model grid;
The obtaining the second sampling point obtained by 3D space sampling of the 3D model grid comprises the following steps: selecting rays in a scene to be modeled passing through the 3D model grid as a preset view angle according to the 3D model grid; 3D space sampling is carried out on the rays of the preset visual angle, and a second sampling point is obtained;
the estimating attribute information of the second sampling point according to the vertex data of the 3D model mesh includes: acquiring a plurality of target grid vertexes adjacent to the second sampling point in the 3D model grid, and determining attribute information of the plurality of target grid vertexes according to vertex data of the 3D model grid; and carrying out weighted summation on the attribute information of the plurality of target grid vertexes through a preset estimation algorithm to obtain the attribute information of the second sampling point.
3. The method according to claim 1, wherein the texture network includes a plurality of material sub-networks for extracting different material information, and the extracting material information from the scene picture to be modeled by using the texture network according to the attribute information of the second sampling point, to obtain the volume density of the second sampling point and the parameter information of the second sampling point in different material dimensions includes:
Extracting geometric information of the scene picture to be modeled by utilizing the first material sub-network according to the attribute information of the second sampling point to obtain an SDF value, a normal line and a volume density of the second sampling point in the shape dimension;
according to the attribute information of the second sampling point, extracting roughness information of the scene picture to be modeled by utilizing a second material sub-network to obtain the roughness information of the second sampling point in the appearance dimension;
according to the attribute information of the second sampling point, extracting specular reflection information of the scene picture to be modeled by utilizing a third material sub-network to obtain the specular reflection information of the second sampling point in the illumination dimension;
according to the attribute information of the second sampling point, extracting diffuse reflection information of the scene picture to be modeled by utilizing a fourth material sub-network to obtain diffuse reflection information of the second sampling point in the appearance dimension and the illumination dimension;
correspondingly, the step of performing color extraction on the scene picture to be modeled by using a pre-trained color generation network according to the parameter information of the second sampling point in different material dimensions to obtain the color information of the second sampling point comprises the following steps:
and carrying out color extraction on the scene picture to be modeled by utilizing a pre-trained color generation network according to the SDF value of the second sampling point in the shape dimension, the normal line, the roughness information of the second sampling point in the appearance dimension, the specular reflection information of the second sampling point in the illumination dimension and the diffuse reflection information of the second sampling point in the appearance dimension and the illumination dimension, so as to obtain the color information of the second sampling point.
4. The method according to claim 1, wherein before the performing color extraction on the scene picture to be modeled by using the pre-trained color generation network according to the parameter information of the second sampling point in the dimensions of different materials to obtain the color information of the second sampling point, the method further comprises:
acquiring influence factors of different environmental factors in a scene to be modeled on color information, and setting additional parameter information suitable for being added into a color generation network according to the influence factors so that the color information of a second sampling point output by the color generation network under the influence of the environmental factors has different color expression effects;
correspondingly, the step of performing color extraction on the scene picture to be modeled by using a pre-trained color generation network according to the parameter information of the second sampling point in different material dimensions to obtain the color information of the second sampling point comprises the following steps:
and carrying out color extraction on the scene picture to be modeled by utilizing a pre-trained color generation network according to the parameter information of the second sampling point on different material dimensions and the additional parameter information, so as to obtain the color information of the second sampling point.
5. The method according to claim 1, wherein after determining the three-dimensional scene model data of the scene picture to be modeled according to the volume density of the second sampling point, the parameter information of the second sampling point in different material dimensions, and the color information of the second sampling point, the method further comprises:
responding to an editing instruction of a scene to be modeled, and acquiring network input parameters with editability of a scene picture to be modeled in the material information extraction process;
and adjusting the network input parameters with the editability according to the editing instruction so as to extract the material information and the color information of the scene picture to be modeled according to the adjusted network input parameters and acquire the updated three-dimensional scene model data of the scene picture to be modeled.
6. The method of claim 5, wherein after said adjusting said network input parameters having editability according to said edit instruction, said method further comprises:
acquiring additional parameter information suitable for being added into a color generation network in a scene to be modeled, wherein the additional parameter information comprises re-illumination obtained by a mapping mode and re-illumination obtained by an additional light source mode;
And adjusting the additional parameter information according to updated mapping information and/or newly added illumination information in the scene to be modeled, extracting information of the scene to be modeled according to the adjusted network input parameters, and extracting the color of the scene to be modeled by combining the adjusted additional attribute parameters to obtain the three-dimensional scene model data of the updated scene to be modeled.
7. The method according to claim 6, wherein after said obtaining additional parameter information in the scene to be modeled suitable for addition to a color generation network, the method further comprises:
acquiring light source information in a scene to be modeled, and determining newly added illumination information in the scene to be modeled by modifying the light source information.
8. The method according to any one of claims 1-7, wherein after determining the three-dimensional scene model data of the scene picture to be modeled according to the volume density of the second sampling point, the parameter information of the second sampling point in different material dimensions, and the color information of the second sampling point, the method further comprises:
acquiring an SDF value of a second sampling point in a shape dimension from parameter information of the second sampling point in different material dimensions, and performing 3D modeling on a scene picture to be modeled according to the SDF value of the second sampling point in the shape dimension;
And carrying out color 3D modeling on the scene picture to be modeled according to the volume density of the second sampling point, the SDF value of the second sampling point in the shape dimension and the color information of the second sampling point.
9. A method of rendering a three-dimensional scene, comprising:
extracting the volume density of the second sampling point and the color information of the second sampling point from the three-dimensional scene model data;
according to the volume density of the second sampling point and the color information of the second sampling point, performing screen pixel volume rendering and light source volume rendering on a scene picture to be modeled, wherein the screen pixel volume rendering is used for performing color rendering on a 3D model, and the light source volume rendering is used for performing shadow rendering on the 3D model;
wherein the three-dimensional scene model data is obtained using the modeling method of the three-dimensional scene of any one of claims 1 to 8.
10. A modeling apparatus for a three-dimensional scene, comprising:
the first acquisition unit is used for acquiring a first sampling point obtained by 3D space sampling of a scene picture to be modeled;
the first extraction unit is used for extracting information of a scene picture to be modeled by utilizing a pre-trained model network according to the attribute information of the first sampling point to obtain the volume density of a second sampling point and the parameter information of the second sampling point on different material dimensions, wherein the second sampling point is obtained by 3D space sampling of rays of a preset visual angle in the scene to be modeled, the pre-trained model network comprises a shape network and a material network, and the shape network is used for extracting the shape information of the scene picture to be modeled according to the attribute information of the first sampling point to obtain a model expressed by SDF; generating a 3D model grid from the model represented by the SDF, and estimating attribute information of a second sampling point according to vertex data of the 3D model grid; extracting material information of a scene picture to be modeled by utilizing the material network according to the attribute information of the second sampling point to obtain the volume density of the second sampling point and the parameter information of the second sampling point on different material dimensions, wherein the volume density of the second sampling point is obtained by extracting geometric information by utilizing a first material sub-network in the material network;
The second extraction unit is used for carrying out color extraction on the scene picture to be modeled by utilizing a pre-trained color generation network according to the parameter information of the second sampling point in different material dimensions to obtain the color information of the second sampling point, wherein the parameter information of the second sampling point in different material dimensions is used as an input parameter corresponding to the pre-trained color generation network;
the determining unit is used for determining three-dimensional scene model data of the scene picture to be modeled according to the volume density of the second sampling point, the parameter information of the second sampling point on different material dimensions and the color information of the second sampling point;
the changing unit is used for representing diffuse reflection illumination in the scene to be modeled by using the environment map, and directly changing the diffuse reflection illumination in the scene to be modeled by changing the environment map in the scene to be modeled into a target environment map; using a network of multi-layer perceptrons to represent specular reflection illumination in the scene to be modeled, and indirectly modifying the specular reflection illumination in the scene to be modeled by replacing the environment map in the scene to be modeled with the target environment map; determining updated mapping information in the scene to be modeled according to the diffuse reflection illumination and the specular reflection illumination in the scene to be modeled after modification; and optimizing the loss function according to the target environment map to enable the specular illumination network $F_s$ to adapt to the target environment map $E'$, the loss function formula being:

$$L_{spec} = \frac{1}{S\,P}\sum_{i=1}^{S}\sum_{j=1}^{P}\left\| F_s(r_i, \omega_j) - E'(\omega_j) \right\|^2,$$

where $S$ represents the number of samples of the mesh surface, $P$ represents the number of pixels in the target environment map $E'$, $F_s$ represents the specular illumination network, $r_i$ represents the roughness value of the i-th sampling point, $\omega_j$ is the unit vector from the camera origin to the j-th pixel position of the target environment map $E'$, and here the normal direction $n$ and the camera direction $\omega_j$ are assumed to be identical;
the changing unit is specifically used for acquiring roughness information of a second sampling point in the scene to be modeled; if the roughness information of the second sampling point is smaller than a preset threshold, the indirectly changing the specular reflection illumination in the scene to be modeled by changing the environment map in the scene to be modeled to the target environment map includes: the method comprises the steps of optimizing a network of a multi-layer perceptron according to a target environment map by replacing the environment map in a scene to be modeled with the target environment map, and indirectly changing specular reflection illumination in the scene to be modeled in the optimization process; if the roughness information of the second sampling point is greater than or equal to a preset threshold, the indirectly changing the specular reflection illumination in the scene to be modeled by changing the environmental map in the scene to be modeled to a target environmental map includes: changing an environment map in a scene to be modeled into a target environment map, generating multi-layer texture mapping of the target environment map according to Monte Carlo sampling, and indirectly changing specular reflection illumination in the scene to be modeled according to the multi-layer texture mapping of the target environment map;
specifically, pre-filtered environment maps at different roughness levels are obtained through Monte Carlo sampling to generate the multi-layer texture map of the target environment map, with the formula:

$$M(n, r) = \frac{\sum_{j=1}^{J} L(\omega_j)\,(n \cdot \omega_j)}{\sum_{j=1}^{J} (n \cdot \omega_j)},$$

where $L(\omega_j)$ represents the light coming from the direction $\omega_j$, $\omega_j$ indicates the direction of the incident light, $n$ indicates the normal of the surface where the incident point is located, and $J$ indicates the number of samples of the incident-light direction;

the multi-layer texture map has a fixed roughness value at each texture level; according to the roughness $r$ and camera direction $\omega$ of each sampling point, the specular illumination of the corresponding level is looked up from the multi-layer texture map, at which point the re-illumination loss function becomes:

$$L_{relight} = \frac{1}{S}\sum_{i=1}^{S}\left\| c_i - M(\omega_i, r_i) \right\|^2,$$

where $M$ is a pre-computed environmental multi-layer texture map and $c_i$ is the color of the light at the sampling point seen from the camera direction $\omega_i$ with roughness $r_i$.
11. A rendering apparatus for a three-dimensional scene, comprising:
a fourth obtaining unit, configured to extract a volume density of the second sampling point and color information of the second sampling point from the three-dimensional scene model data;
the rendering unit is used for performing screen pixel volume rendering and light source volume rendering on the scene picture to be modeled according to the volume density of the second sampling point and the color information of the second sampling point, wherein the screen pixel volume rendering is used for performing color rendering on the 3D model, and the light source volume rendering is used for performing shadow rendering on the 3D model;
Wherein the three-dimensional scene model data is obtained using the modeling method of the three-dimensional scene of any one of claims 1 to 8.
12. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 8 when the computer program is executed.
13. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method according to any one of claims 1 to 8.
CN202311631762.7A 2023-12-01 2023-12-01 Modeling and rendering method, device and equipment for three-dimensional scene Active CN117333637B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311631762.7A CN117333637B (en) 2023-12-01 2023-12-01 Modeling and rendering method, device and equipment for three-dimensional scene


Publications (2)

Publication Number Publication Date
CN117333637A CN117333637A (en) 2024-01-02
CN117333637B true CN117333637B (en) 2024-03-08

Family

ID=89283431

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311631762.7A Active CN117333637B (en) 2023-12-01 2023-12-01 Modeling and rendering method, device and equipment for three-dimensional scene

Country Status (1)

Country Link
CN (1) CN117333637B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117746069B (en) * 2024-02-18 2024-05-14 浙江啄云智能科技有限公司 Graph searching model training method and graph searching method


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220335636A1 (en) * 2021-04-15 2022-10-20 Adobe Inc. Scene reconstruction using geometry and reflectance volume representation of scene

Patent Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114972558A (en) * 2021-02-25 2022-08-30 广州视源电子科技股份有限公司 Handwriting drawing method, device, medium and interactive panel
CN114004941A (en) * 2022-01-04 2022-02-01 苏州浪潮智能科技有限公司 Indoor scene three-dimensional reconstruction system and method based on nerve radiation field
CN114972632A (en) * 2022-04-21 2022-08-30 阿里巴巴达摩院(杭州)科技有限公司 Image processing method and device based on nerve radiation field
CA3199390A1 (en) * 2022-05-12 2023-11-12 Technologies Depix Inc. Systems and methods for rendering virtual objects using editable light-source parameter estimation
CN115115688A (en) * 2022-05-31 2022-09-27 荣耀终端有限公司 Image processing method and electronic equipment
CN114863037A (en) * 2022-07-06 2022-08-05 杭州像衍科技有限公司 Single-mobile-phone-based human body three-dimensional modeling data acquisition and reconstruction method and system
CN116958362A (en) * 2023-02-27 2023-10-27 腾讯科技(深圳)有限公司 Image rendering method, device, equipment and storage medium
CN116246023A (en) * 2023-03-03 2023-06-09 网易(杭州)网络有限公司 Three-dimensional model reconstruction method, apparatus, device, storage medium, and program product
CN116721210A (en) * 2023-06-09 2023-09-08 浙江大学 Real-time efficient three-dimensional reconstruction method and device based on neurosigned distance field
CN116883565A (en) * 2023-06-12 2023-10-13 华中科技大学 Digital twin scene implicit and explicit model fusion rendering method and application
CN116977431A (en) * 2023-06-28 2023-10-31 中国科学院计算技术研究所 Three-dimensional scene geometry, material and illumination decoupling and editing system
CN116958492A (en) * 2023-07-12 2023-10-27 数元科技(广州)有限公司 VR editing application based on NeRf reconstruction three-dimensional base scene rendering
CN116580161A (en) * 2023-07-13 2023-08-11 湖南省建筑设计院集团股份有限公司 Building three-dimensional model construction method and system based on image and NeRF model
CN116993826A (en) * 2023-07-31 2023-11-03 杭州电子科技大学 Scene new view generation method based on local space aggregation nerve radiation field
CN117036612A (en) * 2023-08-18 2023-11-10 武汉创升无限数字科技有限公司 Three-dimensional reconstruction method based on nerve radiation field
CN117036569A (en) * 2023-10-08 2023-11-10 北京渲光科技有限公司 Three-dimensional model color generation network training method, color generation method and device
CN117095132A (en) * 2023-10-18 2023-11-21 北京渲光科技有限公司 Three-dimensional reconstruction method and system based on implicit function

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zehao Yu, "MonoSDF: Exploring Monocular Geometric Cues for Neural Implicit Surface Reconstruction," 36th Conference on Neural Information Processing Systems (NeurIPS), 2022, pp. 1-15 *

Also Published As

Publication number Publication date
CN117333637A (en) 2024-01-02

Similar Documents

Publication Publication Date Title
CN111784821B (en) Three-dimensional model generation method and device, computer equipment and storage medium
CN113838176B (en) Model training method, three-dimensional face image generation method and three-dimensional face image generation equipment
JP2024522287A (en) 3D human body reconstruction method, apparatus, device and storage medium
US10163247B2 (en) Context-adaptive allocation of render model resources
CN117333637B (en) Modeling and rendering method, device and equipment for three-dimensional scene
CN102136156B (en) System and method for mesoscopic geometry modulation
CN114119853B (en) Image rendering method, device, equipment and medium
CN112950769A (en) Three-dimensional human body reconstruction method, device, equipment and storage medium
CN116228943B (en) Virtual object face reconstruction method, face reconstruction network training method and device
CN115115805A (en) Training method, device and equipment for three-dimensional reconstruction model and storage medium
CN115100337A (en) Whole body portrait video relighting method and device based on convolutional neural network
CN116416376A (en) Three-dimensional hair reconstruction method, system, electronic equipment and storage medium
Marques et al. Deep spherical harmonics light probe estimator for mixed reality games
Yu et al. Learning object-centric neural scattering functions for free-viewpoint relighting and scene composition
Yang et al. Reconstructing objects in-the-wild for realistic sensor simulation
CN116363320B (en) Training of reconstruction model and three-dimensional model reconstruction method, device, equipment and medium
CN116385619B (en) Object model rendering method, device, computer equipment and storage medium
CN115984440B (en) Object rendering method, device, computer equipment and storage medium
JP2023178274A (en) Method and system for generating polygon meshes approximating surfaces using root-finding and iteration for mesh vertex positions
CN112785494B (en) Three-dimensional model construction method and device, electronic equipment and storage medium
CN114529648A (en) Model display method, device, apparatus, electronic device and storage medium
CN114119923A (en) Three-dimensional face reconstruction method and device and electronic equipment
JP2017010508A (en) Program, recording medium, luminance computation device, and luminance computation method
Gao et al. EvaSurf: Efficient View-Aware Implicit Textured Surface Reconstruction on Mobile Devices
Galea et al. Gpu-based selective sparse sampling for interactive high-fidelity rendering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant