Disclosure of Invention
The invention aims to provide a human body relighting method based on a dynamic surface reflectance field, which uses a compact spatio-temporal implicit representation to learn high-degree-of-freedom human motion and achieves fine dynamic human geometry reconstruction and material estimation. To model accurate shadow effects, the method estimates direct and indirect illumination simultaneously, and adopts a physically based rendering method to achieve a vivid rendering effect.
To achieve the above purpose, the present invention adopts the following technical solution:
A human body relighting method based on a dynamic surface reflectance field comprises the following steps:
Decomposing the 4D space using multi-plane and hash representations, and encoding the input multi-view dynamic human video with the spatio-temporal multi-plane representation to obtain a compact spatio-temporal position encoding. Specifically: the 4D space is decomposed into a compact multi-plane feature encoder and a time-aware hash encoder. During modeling, rays are cast from the camera center through the imaging plane, the rays are sampled in 4D space with a fixed number of points per ray, and each sampled point is encoded in space and time using the two encoders above.
Inputting the spatio-temporal position encoding into a geometry network to obtain the signed distance function (SDF) value and geometric feature of each ray sample point. Specifically: the spatio-temporal position encodings of the ray sample points are fed into a multi-layer perceptron, and the SDF values and geometric features of the corresponding points are obtained by fitting the rendering loss.
Inputting the geometric features and spatio-temporal position encodings of the ray sample points into a color network to obtain the color values of the sample points. Specifically: the spatio-temporal position encoding of each sample point is concatenated with its geometric feature, fed into a multi-layer perceptron, and the color value of the corresponding point is obtained by fitting the rendering loss.
Integrating the density, normal, color and material of the sample points along each ray using volume rendering to obtain the depth, normal, color and material of the corresponding pixel, thereby obtaining a depth map, normal map, color map and material map of the dynamic human body.
For illumination modeling, the method estimates direct and indirect illumination simultaneously. Direct illumination is modeled with spherical Gaussian functions, whose parameters are compact to optimize and therefore converge easily; indirect illumination relies on the properties of the neural radiance field, with visibility and indirect light modeled by ray tracing.
Determining the positions of surface points from the obtained depth map, and obtaining the final rendered image for each surface point with a physically based rendering method. Specifically: the spatial positions of the surface points are obtained from the ray samples using the depth information, and for each surface point the final rendered image is computed by a physically based rendering method that feeds geometry, material, visibility and illumination into a microfacet model.
Taking the target video as supervision, simultaneously constraining the rendered images obtained by both volume rendering and physically based rendering in the above steps, and learning the model parameters by minimizing these constraints. The main constraint is the rendering loss supervised by the target video; it is complemented by a material smoothness loss and geometric constraints.
During relighting, a new environment map replaces the direct illumination in the lighting model, and the physically based rendering method synthesizes the dynamic human relighting video under the new illumination.
The effects stated in this summary are only those of particular embodiments, not all effects of the invention. The above technical solution has the following advantages or beneficial effects:
The invention provides a human body relighting method based on a dynamic surface reflectance field. It designs an efficient 4D implicit representation to model the human surface reflectance field, overcomes the large fitting error and limited motion freedom inherent in template-based methods, and achieves accurate estimation of the dynamic human surface reflectance field. In the illumination model, visibility and indirect light are introduced through ray tracing, so that secondary-bounce shading effects are accurately simulated, yielding more accurate material estimation and relighting results.
Detailed Description
As shown in fig. 1, a human body relighting method based on a dynamic surface reflectance field comprises the following steps:
S1, decomposing the 4D space using multi-plane and hash representations, and encoding the input multi-view dynamic human video with the spatio-temporal multi-plane representation to obtain a compact spatio-temporal position encoding;
S2, inputting the spatio-temporal position encoding into a geometry network to obtain the signed distance function value and geometric feature of each ray sample point;
S3, inputting the geometric features and spatio-temporal position encodings of the ray sample points into a color network to obtain the color values of the sample points;
S4, applying volume rendering to the ray sample points to obtain the depth, normal, color and material of the corresponding pixel, thereby obtaining a depth map, normal map, color map and material map of the dynamic human body;
S5, modeling direct illumination with spherical Gaussian functions, and modeling light visibility and indirect illumination with ray tracing;
S6, determining the positions of surface points from the obtained depth map, and obtaining the final rendered image for each surface point with a physically based rendering method;
S7, taking the target video as supervision, simultaneously constraining the rendered images obtained by both volume rendering and physically based rendering in the above steps, and learning the model parameters by minimizing these constraints;
S8, during relighting, replacing the direct illumination with a new environment map to obtain the dynamic human relighting video.
In step S1, the 4D space is decomposed into a compact multi-plane feature encoder and a time-aware hash encoder. During modeling, rays are cast from the camera center through the imaging plane, the rays are sampled in 4D space with a fixed number of points per ray, and each sampled point is encoded in space and time using the two encoders above. For each sample point x at time t, the spatio-temporal encoding can be defined as:
e(x, t) = P(x, t) ⊕ H(x, t), with P(x, t) = ⊙_k M_k(x, t),
where P denotes the multi-plane feature encoder, H the time-aware hash encoder, M_k the low-dimensional tensors decomposed from the 4D tensor, ⊕ the concatenation operation, and ⊙ the Hadamard product.
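The decomposition above can be sketched in code. The following is a minimal NumPy illustration, not the patented implementation: the plane resolution, feature dimension, hash-table size, hash primes, and nearest-neighbor (rather than bilinear) lookup are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not the patent's values).
RES, FDIM, TABLE = 32, 4, 2 ** 14

# Six 2-D feature planes factorizing the 4-D (x, y, z, t) volume:
# three space planes (xy, xz, yz) and three space-time planes (xt, yt, zt).
PLANES = [(0, 1), (0, 2), (1, 2), (0, 3), (1, 3), (2, 3)]
plane_grids = [rng.standard_normal((RES, RES, FDIM)) for _ in PLANES]
hash_table = rng.standard_normal((TABLE, FDIM))

def multiplane_encode(p):
    """Hadamard product of nearest-neighbor plane features (p in [0,1]^4)."""
    feat = np.ones(FDIM)
    for (a, b), grid in zip(PLANES, plane_grids):
        i = min(int(p[a] * RES), RES - 1)
        j = min(int(p[b] * RES), RES - 1)
        feat = feat * grid[i, j]          # Hadamard product across planes
    return feat

def hash_encode(p):
    """Time-aware spatial hash: quantize (x, y, z, t) and index a table."""
    q = (np.asarray(p) * RES).astype(np.int64)
    primes = np.array([1, 2654435761, 805459861, 3674653429], dtype=np.int64)
    idx = int(np.bitwise_xor.reduce(q * primes) % TABLE)
    return hash_table[idx]

def spacetime_encode(p):
    """Concatenate plane and hash features: e(x, t) = P(x, t) (+) H(x, t)."""
    return np.concatenate([multiplane_encode(p), hash_encode(p)])

code = spacetime_encode([0.2, 0.5, 0.7, 0.1])
```

In practice both encoders would hold trainable features at multiple resolutions; the sketch only shows how a 4D query is reduced to a compact feature vector.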
In step S2, the spatio-temporal position encodings of the ray sample points are input into a small multi-layer perceptron, and the signed distance function values and geometric features of the corresponding points are obtained by fitting the rendering loss. The process can be expressed as (s, z) = F_g(e(x, t)), where F_g is the geometry network, s is the signed distance value, and z is the geometric feature.
In step S3, the spatio-temporal position encoding of each sample point is concatenated with its geometric feature and input into a small multi-layer perceptron, and the color value of the corresponding sample point is obtained by fitting the rendering loss. The process can be expressed as c = F_c(e(x, t) ⊕ z), where F_c is the color network and c is the color value of the sample point.
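The two networks F_g and F_c can be sketched as plain NumPy MLPs. This is only an illustration of the data flow; the layer widths, activations, and random (untrained) weights are assumptions, not the patent's architecture.

```python
import numpy as np

rng = np.random.default_rng(1)
ENC, GEO = 8, 16          # encoding and geometric-feature widths (assumed)

def mlp(widths, seed):
    """Random-weight MLP layers as (W, b) pairs (stand-in for trained nets)."""
    r = np.random.default_rng(seed)
    return [(r.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(widths[:-1], widths[1:])]

def forward(layers, x):
    for k, (W, b) in enumerate(layers):
        x = x @ W + b
        if k < len(layers) - 1:
            x = np.maximum(x, 0.0)        # ReLU on hidden layers
    return x

# Geometry network F_g: encoding -> (SDF value s, geometric feature z).
geo_net = mlp([ENC, 32, 1 + GEO], seed=1)
# Color network F_c: encoding concatenated with z -> RGB.
color_net = mlp([ENC + GEO, 32, 3], seed=2)

def geometry(e):
    out = forward(geo_net, e)
    return out[0], out[1:]                # (signed distance s, feature z)

def color(e, z):
    raw = forward(color_net, np.concatenate([e, z]))
    return 1.0 / (1.0 + np.exp(-raw))     # sigmoid keeps RGB in [0, 1]

e = rng.standard_normal(ENC)
s, z = geometry(e)
rgb = color(e, z)
```

In the actual method both MLPs are fitted jointly through the rendering loss rather than initialized randomly.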
In step S4, the density, normal, color and material of the sample points along each ray are integrated using volume rendering, so as to obtain the depth map, normal map, color map and material map of the dynamic human body. Taking the color map as an example, this process can be expressed as:
C(o, d) = ∫ T(t) σ(r(t)) c(r(t)) dt, with r(t) = o + t d,
where o denotes the camera center, d the direction of the ray cast from the camera center (opposite to the incoming light direction), T(t) the accumulated transmittance, σ the volume density, c the sample-point color, and C the pixel color produced by volume rendering.
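The integral above is evaluated with the standard discrete quadrature; the same accumulation applies unchanged to depth, normals, and material channels. A minimal sketch with a toy ray (constant density, made-up per-sample colors — illustrative values only):

```python
import numpy as np

def volume_render(sigmas, values, deltas):
    """Discrete volume rendering along one ray.

    alpha_j = 1 - exp(-sigma_j * delta_j);  T_j = prod_{l<j} (1 - alpha_l);
    output = sum_j T_j * alpha_j * value_j.  Works for color, depth, normal
    or material values alike.
    """
    alphas = 1.0 - np.exp(-sigmas * deltas)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = trans * alphas
    return weights @ values, weights

# Toy ray: 64 samples over t in [0, 4], constant density 2.0.
n = 64
sigmas = np.full(n, 2.0)
deltas = np.full(n, 4.0 / n)
ts = np.linspace(0.0, 4.0, n)
colors = np.stack([ts / 4.0, 1.0 - ts / 4.0, np.full(n, 0.5)], axis=-1)

pixel, w = volume_render(sigmas, colors, deltas)   # rendered pixel color
depth, _ = volume_render(sigmas, ts, deltas)       # rendered pixel depth
```

Rendering the sample distances ts instead of colors is exactly how the depth map used in step S6 is produced.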
In step S5, for illumination modeling, the method estimates direct and indirect illumination simultaneously. Direct illumination is modeled with spherical Gaussian functions, which compress the number of parameters to be optimized and therefore converge easily; indirect illumination relies on the properties of the neural radiance field, and visibility and indirect light are obtained by ray tracing.
Direct illumination L_d can be expressed as:
L_d(ω_i) = Σ_{k=1}^{K} G(ω_i; ξ_k, λ_k, μ_k),
where G denotes the mixed spherical Gaussian function, (ξ_k, λ_k, μ_k) are the optimizable parameters of lobe k, K is the total number of lobes, and ω_i is the incident light direction.
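A spherical Gaussian lobe evaluates to μ·exp(λ(ω·ξ − 1)), so the mixture above is a few lines of code. The lobe count and parameter values below are illustrative assumptions, not fitted illumination:

```python
import numpy as np

def sg_eval(w_i, axis, sharpness, amplitude):
    """One spherical Gaussian lobe: G(w) = mu * exp(lambda * (w . xi - 1))."""
    return amplitude * np.exp(sharpness * (w_i @ axis - 1.0))

def direct_light(w_i, lobes):
    """Mixture of K lobes approximating the environment illumination."""
    return sum(sg_eval(w_i, *lobe) for lobe in lobes)

# Two illustrative lobes: (axis xi, sharpness lambda, RGB amplitude mu).
lobes = [
    (np.array([0.0, 0.0, 1.0]), 10.0, np.array([1.0, 0.9, 0.8])),
    (np.array([0.0, 1.0, 0.0]), 4.0, np.array([0.2, 0.3, 0.5])),
]

up = np.array([0.0, 0.0, 1.0])
L_up = direct_light(up, lobes)   # incident radiance from straight overhead
```

Because each lobe is only a unit axis, a scalar sharpness and an RGB amplitude, the whole environment is described by a handful of parameters, which is why this representation optimizes and converges easily.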
Indirect light relies on the properties of the neural radiance field; the visibility V and indirect illumination L_ind are obtained by ray tracing, specifically:
V(x_s, ω_i) = T_N, L_ind(x_s, ω_i) = Σ_{j=1}^{N} T_j (1 − exp(−σ_j δ_j)) c_j,
where x_j is the position of the j-th sample point, c_j its color obtained by volume rendering, and T_j the transmittance at the j-th sample point. The ray cast from the surface point x_s in direction ω_i can be expressed as r(t) = x_s + t ω_i. In actual sampling, N (= 512) points are taken by discrete sampling, where δ_j is the sampling interval of the j-th sample point.
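The secondary-ray march can be sketched as follows. The density and radiance fields here are toy stand-ins (a single hard-coded occluding ball) for the learned neural radiance field; only the accumulation of transmittance (visibility) and radiance (indirect light) mirrors the method.

```python
import numpy as np

def density(x):
    """Toy occluder: a dense ball of radius 0.3 centered at (0, 0, 1)."""
    return 50.0 if np.linalg.norm(x - np.array([0.0, 0.0, 1.0])) < 0.3 else 0.0

def radiance(x, d):
    """Toy emitted radiance of the occluder (stand-in for the NeRF color)."""
    return np.array([0.8, 0.2, 0.2])

def trace(x_s, w_i, n=512, t_far=2.0):
    """March from surface point x_s along w_i: r(t) = x_s + t * w_i.

    Returns (visibility, indirect radiance), accumulated exactly as in
    volume rendering; the final transmittance T is the visibility V.
    """
    ts = np.linspace(1e-3, t_far, n)
    delta = ts[1] - ts[0]
    T, L_ind = 1.0, np.zeros(3)
    for t in ts:
        sigma = density(x_s + t * w_i)
        alpha = 1.0 - np.exp(-sigma * delta)
        L_ind += T * alpha * radiance(x_s + t * w_i, w_i)
        T *= 1.0 - alpha
    return T, L_ind

vis_blocked, ind = trace(np.zeros(3), np.array([0.0, 0.0, 1.0]))  # hits ball
vis_open, _ = trace(np.zeros(3), np.array([1.0, 0.0, 0.0]))       # clear path
```

The ray toward the occluder yields near-zero visibility but nonzero indirect radiance (light re-emitted by the blocker), while the unobstructed ray keeps visibility at one; this is precisely how secondary-bounce shadowing enters the shading.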
In step S6, the spatial positions of the surface points are obtained from the ray samples using the depth information, and for each surface point the final rendered image is computed by a physically based rendering method that feeds geometry, material, visibility and illumination into a microfacet model. The physically based rendering formula is:
L_o(x, ω_o) = ∫_Ω f_r(x, ω_i, ω_o) L_i(x, ω_i) (ω_i · n) dω_i,
where n is the surface normal, L_i(x, ω_i) is the incident radiance received at x from direction ω_i, ω_o is the outgoing (viewing) direction, and f_r is the surface material (BRDF).
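A minimal shading sketch under stated assumptions: a standard Cook–Torrance GGX microfacet BRDF (a common choice, not necessarily the patent's exact material model), with the hemisphere integral replaced by a sum over a few discrete incident directions, each carrying its own visibility factor.

```python
import numpy as np

def ggx_brdf(n, w_i, w_o, albedo, roughness, f0=0.04):
    """Simplified Cook-Torrance microfacet BRDF (GGX distribution)."""
    h = w_i + w_o
    h = h / np.linalg.norm(h)
    ndoth, ndoti, ndoto = n @ h, max(n @ w_i, 1e-4), max(n @ w_o, 1e-4)
    a2 = roughness ** 4
    D = a2 / (np.pi * ((ndoth ** 2) * (a2 - 1.0) + 1.0) ** 2)
    k = (roughness + 1.0) ** 2 / 8.0
    G = (ndoti / (ndoti * (1 - k) + k)) * (ndoto / (ndoto * (1 - k) + k))
    F = f0 + (1.0 - f0) * (1.0 - max(h @ w_o, 0.0)) ** 5
    specular = D * G * F / (4.0 * ndoti * ndoto)
    return albedo / np.pi + specular     # diffuse + specular lobes

def shade(n, w_o, lights, albedo, roughness):
    """L_o = sum_i f_r(w_i, w_o) * V_i * L_i * (n . w_i) over sampled w_i."""
    L_o = np.zeros(3)
    for w_i, L_i, vis in lights:
        cos = n @ w_i
        if cos > 0.0:
            L_o += ggx_brdf(n, w_i, w_o, albedo, roughness) * vis * L_i * cos
    return L_o

n = np.array([0.0, 0.0, 1.0])
w_o = np.array([0.0, 0.0, 1.0])
# (direction, incident radiance, visibility); the second light is below the
# surface and is rejected by the cosine test.
lights = [(np.array([0.0, 0.0, 1.0]), np.array([2.0, 2.0, 2.0]), 1.0),
          (np.array([0.0, 0.0, -1.0]), np.array([5.0, 5.0, 5.0]), 1.0)]
rgb = shade(n, w_o, lights, albedo=np.array([0.6, 0.4, 0.3]), roughness=0.5)
```

In the full method, L_i combines the spherical Gaussian direct light and the traced indirect light, and the visibility factor comes from the secondary-ray transmittance of step S5.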
In step S7, taking the target video as supervision, the rendered images obtained by both volume rendering and physically based rendering in the above steps are constrained simultaneously. The main constraint is the rendering loss supervised by the target video, complemented by a material smoothness loss and geometric constraints; the model parameters are learned by minimizing these constraints.
The main constraint loss L_c is defined as:
L_c = ||C_v − C_gt||² + ||C_p − C_gt||²,
where C_v is the color produced by volume rendering, C_p is the color produced by physically based rendering, and C_gt is the ground-truth color used for supervision.
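The main loss amounts to two per-pixel squared errors against the same target frame. A minimal sketch (the relative weight `w_pbr` and the mean reduction are illustrative assumptions; the material smoothness and geometric terms are omitted):

```python
import numpy as np

def relight_loss(c_vol, c_pbr, c_gt, w_pbr=1.0):
    """Main rendering loss: both rendering branches supervised by the target.

    L_c = ||C_v - C_gt||^2 + w_pbr * ||C_p - C_gt||^2  (per-pixel mean).
    """
    l_vol = np.mean((c_vol - c_gt) ** 2)
    l_pbr = np.mean((c_pbr - c_gt) ** 2)
    return l_vol + w_pbr * l_pbr

rng = np.random.default_rng(3)
gt = rng.random((4, 4, 3))                      # toy 4x4 RGB target frame
loss_perfect = relight_loss(gt, gt, gt)         # both branches match target
loss_noisy = relight_loss(gt + 0.1, gt, gt)     # volume branch is off by 0.1
```

Supervising both branches with one target couples them: the volume branch anchors geometry while the physically based branch forces the decomposition into material, visibility and illumination to explain the same pixels.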
In step S8, after modeling is completed, relighting only requires replacing the direct illumination with a new environment map to obtain the dynamic human relighting video.
While the foregoing description of the embodiments of the present invention has been presented in conjunction with the drawings, it should be understood that it is not intended to limit the scope of the invention, but rather, it is intended to cover all modifications or variations within the scope of the invention as defined by the claims of the present invention.