NZ794397B2 - Techniques for multi-view neural object modeling - Google Patents

Techniques for multi-view neural object modeling

Info

Publication number
NZ794397B2
Authority
NZ
New Zealand
Prior art keywords
color
code
computer
images
machine learning
Prior art date
Application number
NZ794397A
Other versions
NZ794397A (en)
Inventor
Daoye Wang
Derek Edward Bradley
Gaspard Zoss
Paulo Fabiano Urnau Gotardo
Prashanth Chandran
Original Assignee
Disney Enterprises Inc
ETH Zürich (Eidgenössische Technische Hochschule Zürich)
Filing date
Publication date
Priority claimed from US17/983,246 external-priority patent/US12236517B2/en
Application filed by Disney Enterprises Inc, ETH Zürich (Eidgenössische Technische Hochschule Zürich) filed Critical Disney Enterprises Inc
Publication of NZ794397A publication Critical patent/NZ794397A/en
Publication of NZ794397B2 publication Critical patent/NZ794397B2/en

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/40 Filling a planar surface by adding surface attributes, e.g. colour or texture
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/04 Texture mapping
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/06 Ray-tracing
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/08 Volume rendering
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/60 Analysis of geometric attributes
    • G06T7/62 Analysis of geometric attributes of area, perimeter, diameter or volume

Abstract

Techniques are disclosed for generating photorealistic images of objects, such as heads, from multiple viewpoints. In some embodiments, a morphable radiance field (MoRF) model that generates images of heads includes an identity model that maps an identifier (ID) code associated with a head into two codes: a deformation ID code encoding a geometric deformation from a canonical head geometry, and a canonical ID code encoding a canonical appearance within a shape-normalized space. The MoRF model also includes a deformation field model that maps a world space position to a shape-normalized space position based on the deformation ID code. Further, the MoRF model includes a canonical neural radiance field (NeRF) model that includes a density multi-layer perceptron (MLP) branch, a diffuse MLP branch, and a specular MLP branch that output densities, diffuse colors, and specular colors, respectively. The MoRF model can be used to render images of heads from various viewpoints.
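The model structure described in the abstract can be sketched in code. The following is an illustrative NumPy sketch only, not the patented implementation: all names (`morf_query`, `id_to_deform`, `deform_field`, etc.), layer sizes, and the use of single random linear layers in place of trained MLPs are assumptions for exposition.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(in_dim, out_dim):
    """A single random linear layer standing in for a trained MLP."""
    W = rng.standard_normal((in_dim, out_dim)) * 0.1
    return lambda x: x @ W

# Identity model: ID code -> (deformation ID code, canonical ID code)
id_to_deform = linear(32, 16)
id_to_canon = linear(32, 16)

# Deformation field: (world position, deformation ID code) -> shape-normalized position
deform_field = linear(3 + 16, 3)

# Canonical NeRF: shared trunk, then density / diffuse / specular branches
trunk = linear(3 + 16, 64)
density_branch = linear(64, 1)
diffuse_branch = linear(64, 3)
specular_branch = linear(64, 3)

def morf_query(world_pos, id_code):
    """Query the sketch MoRF at one world-space position for one identity."""
    deform_code = id_to_deform(id_code)
    canon_code = id_to_canon(id_code)
    canon_pos = deform_field(np.concatenate([world_pos, deform_code]))
    h = np.tanh(trunk(np.concatenate([canon_pos, canon_code])))
    density = np.exp(density_branch(h))              # non-negative density
    diffuse = 1 / (1 + np.exp(-diffuse_branch(h)))   # colors squashed to (0, 1)
    specular = 1 / (1 + np.exp(-specular_branch(h)))
    return density[0], diffuse, specular

density, diffuse, specular = morf_query(np.zeros(3), rng.standard_normal(32))
```

The key structural point the sketch illustrates is the factoring: identity information enters only through the two ID codes, so the deformation field and the canonical NeRF are shared across all identities.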

Claims (18)

WHAT IS CLAIMED IS:
1. A computer-implemented method for rendering an image of an object, the method comprising:
tracing a ray through a pixel into a virtual scene;
sampling one or more positions along the ray;
applying a machine learning model to the one or more positions and an identifier (ID) code associated with an object to determine, for each position included in the one or more positions, a density, a diffuse color, and a specular color; and
computing a color of the pixel based on the density, the diffuse color, and the specular color corresponding to each position included in the one or more positions;
wherein the machine learning model comprises an identity model that maps the ID code to (i) a deformation ID code that encodes a geometric deformation from a canonical object geometry, and (ii) a canonical ID code that encodes an appearance within a space associated with the canonical object geometry.
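The four recited steps (trace, sample, query, composite) can be sketched as follows. This is a hedged illustration under assumed details: `toy_model` is a constant-valued placeholder for the claimed machine learning model, and the compositing uses standard NeRF-style volume-rendering quadrature, which the claim does not mandate.

```python
import numpy as np

def sample_ray(origin, direction, near=0.5, far=2.0, n_samples=64):
    """Steps 1-2: trace a ray through a pixel and sample positions along it."""
    direction = direction / np.linalg.norm(direction)
    t = np.linspace(near, far, n_samples)
    return t, origin + t[:, None] * direction

def toy_model(positions, id_code):
    """Step 3 placeholder: a density, diffuse color, and specular color per position."""
    n = len(positions)
    density = np.ones(n)
    diffuse = np.full((n, 3), 0.6)
    specular = np.full((n, 3), 0.1)
    return density, diffuse, specular

def render_pixel(origin, direction, id_code):
    """Step 4: composite per-sample outputs into one pixel color."""
    t, positions = sample_ray(origin, direction)
    density, diffuse, specular = toy_model(positions, id_code)
    delta = np.diff(t, append=t[-1] + (t[-1] - t[-2]))  # inter-sample distances
    alpha = 1.0 - np.exp(-density * delta)              # per-sample opacity
    transmittance = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = transmittance * alpha
    return weights @ (diffuse + specular)               # sum of diffuse and specular

color = render_pixel(np.zeros(3), np.array([0.0, 0.0, 1.0]), id_code=None)
```

With the constant toy field, every channel of `color` comes out equal and strictly between 0 and 1, since the accumulated weights sum to less than one.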
2. The computer-implemented method of claim 1, wherein the machine learning model comprises a neural radiance field (NeRF) model that comprises a multi-layer perceptron (MLP) trunk, a first MLP branch that computes densities, a second MLP branch that computes diffuse colors, and a third MLP branch that computes specular colors.
3. The computer-implemented method of any preceding claim, wherein computing the color of the pixel comprises:
averaging the diffuse color corresponding to each position included in the one or more positions based on the density corresponding to the position to determine an averaged diffuse color;
averaging the specular color corresponding to each position included in the one or more positions based on the density corresponding to the position to determine an averaged specular color; and
computing the color of the pixel based on the averaged diffuse color and the averaged specular color.
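The separate density-based averaging of diffuse and specular colors in this claim can be sketched as below. The normalized-density weighting is one assumed reading of "based on the density"; the helper name `density_weighted_average` and the sample values are hypothetical.

```python
import numpy as np

def density_weighted_average(densities, colors):
    """Average per-sample colors with weights proportional to the densities."""
    w = densities / densities.sum()
    return w @ colors

densities = np.array([0.2, 1.0, 0.3])
diffuse = np.array([[0.9, 0.1, 0.1],
                    [0.8, 0.2, 0.2],
                    [0.7, 0.3, 0.3]])
specular = np.array([[0.05, 0.05, 0.05]] * 3)

# Diffuse and specular are averaged independently, then combined into the pixel color.
avg_diffuse = density_weighted_average(densities, diffuse)
avg_specular = density_weighted_average(densities, specular)
pixel = np.clip(avg_diffuse + avg_specular, 0.0, 1.0)
```

Keeping the two averaged components separate until the final combination is what allows the diffuse and specular branches of the model to be supervised and manipulated independently.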
4. The computer-implemented method of any preceding claim, further comprising training the machine learning model based on a set of images of one or more objects that are captured from a plurality of viewpoints.
5. The computer-implemented method of claim 4, wherein the set of images includes a first set of images that include diffuse colors and specular information and a second set of images that include the diffuse colors.
6. The computer-implemented method of claim 4 or 5, wherein the machine learning model is further trained based on a generated set of images of the one or more objects from another plurality of viewpoints.
7. The computer-implemented method of any preceding claim, further comprising fitting at least one of the ID code or the machine learning model to one or more images of another object.
8. The computer-implemented method of claim 7, further comprising fitting the at least one of the ID code or the machine learning model to geometry associated with the another object.
9. The computer-implemented method of any preceding claim, wherein the object is a head.
10. One or more computer-readable storage media including instructions that, when executed by one or more processing units, cause the one or more processing units to perform steps for rendering an image of an object, the steps comprising:
tracing a ray through a pixel into a virtual scene;
sampling one or more positions along the ray;
applying a machine learning model to the one or more positions and an identifier (ID) code associated with an object to determine, for each position included in the one or more positions, a density, a diffuse color, and a specular color; and
computing a color of the pixel based on the density, the diffuse color, and the specular color corresponding to each position included in the one or more positions;
wherein the machine learning model comprises an identity model that maps the ID code to (i) a deformation ID code that encodes a geometric deformation from a canonical object geometry, and (ii) a canonical ID code that encodes an appearance within a space associated with the canonical object geometry.
11. The one or more computer-readable storage media of claim 10, wherein the machine learning model comprises a neural radiance field (NeRF) model that comprises a multi-layer perceptron (MLP) trunk, a first MLP branch that computes densities, a second MLP branch that computes diffuse colors, and a third MLP branch that computes specular colors.
12. The one or more computer-readable storage media of any one of claims 10 or 11, wherein computing the color of the pixel comprises:
averaging the diffuse color corresponding to each position included in the one or more positions based on the density corresponding to the position to determine an averaged diffuse color;
averaging the specular color corresponding to each position included in the one or more positions based on the density corresponding to the position to determine an averaged specular color; and
computing the color of the pixel based on the averaged diffuse color and the averaged specular color.
13. The one or more computer-readable storage media of any of claims 10 to 12, wherein the instructions, when executed by the one or more processing units, further cause the one or more processing units to perform the step of training the machine learning model based on a set of images of one or more objects that are captured from a plurality of viewpoints.
14. The one or more computer-readable storage media of claim 13, wherein the set of images includes a first set of images that include diffuse colors and specular information and a second set of images that include the diffuse colors.
15. The one or more computer-readable storage media of any of claims 10 to 14, wherein the instructions, when executed by the one or more processing units, further cause the one or more processing units to perform the step of fitting at least one of the ID code or the machine learning model to one or more images of another object.
16. The one or more computer-readable storage media of claim 15, wherein the instructions, when executed by the one or more processing units, further cause the one or more processing units to perform the step of fitting the at least one of the ID code or the machine learning model to geometry associated with the another object.
17. A computer-implemented method for training a machine learning model, the method comprising:
receiving a first set of images of one or more objects that are captured from a plurality of viewpoints;
generating a second set of images of the one or more objects from another plurality of viewpoints; and
training, based on the first set of images and the second set of images, a machine learning model, wherein the machine learning model comprises a neural radiance field model and an identity model, and wherein the identity model maps an identifier (ID) code to (i) a deformation ID code that encodes a geometric deformation from a canonical object geometry, and (ii) a canonical ID code that encodes an appearance within a space associated with the canonical object geometry.
18. The method of claim 17, wherein the training is based on at least one of a rendering loss, a deformation loss, a density loss, or an ID loss.
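Claim 18 names four loss terms; a common way to combine such terms is a weighted sum, sketched below. The specific formulas (mean-squared rendering and deformation errors, an L1 sparsity penalty on density, an L2 penalty on the ID code) and all weight values are assumptions for illustration, not the claimed training objective.

```python
import numpy as np

def total_loss(pred_rgb, gt_rgb, deform_residual, density, id_code,
               w_render=1.0, w_deform=0.1, w_density=0.01, w_id=0.001):
    """Weighted sum of the four losses named in claim 18 (weights are assumed)."""
    rendering_loss = np.mean((pred_rgb - gt_rgb) ** 2)  # photometric error vs. ground truth
    deformation_loss = np.mean(deform_residual ** 2)    # penalize large deformations
    density_loss = np.mean(np.abs(density))             # encourage sparse density
    id_loss = np.mean(id_code ** 2)                     # keep ID codes well-conditioned
    return (w_render * rendering_loss + w_deform * deformation_loss
            + w_density * density_loss + w_id * id_loss)
```

With all inputs zero the loss is zero, and each term contributes in proportion to its weight, so the relative weights control the trade-off between image fidelity and regularization of the deformation, density, and identity codes.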
NZ794397A 2022-11-15 Techniques for multi-view neural object modeling NZ794397B2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202163280101P 2021-11-16 2021-11-16
US17/983,246 US12236517B2 (en) 2021-11-16 2022-11-08 Techniques for multi-view neural object modeling

Publications (2)

Publication Number Publication Date
NZ794397A NZ794397A (en) 2025-05-30
NZ794397B2 true NZ794397B2 (en) 2025-09-02

