NZ794397B2 - Techniques for multi-view neural object modeling - Google Patents
Techniques for multi-view neural object modeling
- Publication number
- NZ794397B2
- Authority
- NZ
- New Zealand
- Prior art keywords
- color
- code
- computer
- images
- machine learning
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/40—Filling a planar surface by adding surface attributes, e.g. colour or texture
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/04—Texture mapping
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/06—Ray-tracing
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/08—Volume rendering
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/60—Analysis of geometric attributes
- G06T7/62—Analysis of geometric attributes of area, perimeter, diameter or volume
Abstract
Techniques are disclosed for generating photorealistic images of objects, such as heads, from multiple viewpoints. In some embodiments, a morphable radiance field (MoRF) model that generates images of heads includes an identity model that maps an identifier (ID) code associated with a head into two codes: a deformation ID code encoding a geometric deformation from a canonical head geometry, and a canonical ID code encoding a canonical appearance within a shape-normalized space. The MoRF model also includes a deformation field model that maps a world space position to a shape-normalized space position based on the deformation ID code. Further, the MoRF model includes a canonical neural radiance field (NeRF) model that includes a density multi-layer perceptron (MLP) branch, a diffuse MLP branch, and a specular MLP branch that output densities, diffuse colors, and specular colors, respectively. The MoRF model can be used to render images of heads from various viewpoints.
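The three-stage structure summarized in the abstract (identity model, deformation field, canonical NeRF with density/diffuse/specular branches) can be sketched as follows. This is a minimal illustration with assumed layer sizes, code dimensions, and activations, using random untrained weights; it is not the patented implementation.

```python
# Sketch of the MoRF structure described above (hypothetical shapes and
# layer sizes; random weights stand in for the learned networks).
import numpy as np

rng = np.random.default_rng(0)

def mlp(dims):
    """Random-weight MLP used as a stand-in for a learned network."""
    return [(rng.standard_normal((i, o)) * 0.1, np.zeros(o))
            for i, o in zip(dims[:-1], dims[1:])]

def forward(layers, x):
    for w, b in layers[:-1]:
        x = np.maximum(x @ w + b, 0.0)   # ReLU hidden layers
    w, b = layers[-1]
    return x @ w + b                     # linear output layer

# Identity model: ID code -> (deformation ID code, canonical ID code)
identity_model  = mlp([32, 64, 48])      # output split 16 + 32 below
# Deformation field: (world position, deformation ID) -> canonical position
deform_field    = mlp([3 + 16, 64, 3])
# Canonical NeRF: shared trunk plus three output branches
trunk           = mlp([3 + 32, 64, 64])
density_branch  = mlp([64, 1])
diffuse_branch  = mlp([64, 3])
specular_branch = mlp([64, 3])

def morf(world_pos, id_code):
    codes = forward(identity_model, id_code)
    deform_id, canonical_id = codes[:16], codes[16:]
    canon_pos = forward(deform_field, np.concatenate([world_pos, deform_id]))
    h = forward(trunk, np.concatenate([canon_pos, canonical_id]))
    density  = np.log1p(np.exp(forward(density_branch, h)))    # softplus >= 0
    diffuse  = 1 / (1 + np.exp(-forward(diffuse_branch, h)))   # sigmoid in [0, 1]
    specular = 1 / (1 + np.exp(-forward(specular_branch, h)))
    return density, diffuse, specular

density, diffuse, specular = morf(np.zeros(3), rng.standard_normal(32))
```

Note the ordering: the world-space sample position is first mapped into the shape-normalized (canonical) space by the deformation field, and only then queried against the canonical NeRF, so one canonical appearance can be shared across differently shaped heads.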
Claims (18)
1. A computer-implemented method for rendering an image of an object, the method comprising: tracing a ray through a pixel into a virtual scene; sampling one or more positions along the ray; applying a machine learning model to the one or more positions and an identifier (ID) code associated with an object to determine, for each position included in the one or more positions, a density, a diffuse color, and a specular color; and computing a color of the pixel based on the density, the diffuse color, and the specular color corresponding to each position included in the one or more positions; wherein the machine learning model comprises an identity model that maps the ID code to (i) a deformation ID code that encodes a geometric deformation from a canonical object geometry, and (ii) a canonical ID code that encodes an appearance within a space associated with the canonical object geometry.
2. The computer-implemented method of claim 1, wherein the machine learning model comprises a neural radiance field (NeRF) model that comprises a multi-layer perceptron (MLP) trunk, a first MLP branch that computes densities, a second MLP branch that computes diffuse colors, and a third MLP branch that computes specular colors.
3. The computer-implemented method of any preceding claim, wherein computing the color of the pixel comprises: averaging the diffuse color corresponding to each position included in the one or more positions based on the density corresponding to the position to determine an averaged diffuse color; averaging the specular color corresponding to each position included in the one or more positions based on the density corresponding to the position to determine an averaged specular color; and computing the color of the pixel based on the averaged diffuse color and the averaged specular color.
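The per-pixel computation recited in claims 1 and 3 follows the familiar volume-rendering pattern: densities along the ray yield per-sample weights, which average the diffuse and specular colors before the two are combined. The sketch below assumes standard NeRF-style alpha compositing and an additive diffuse-plus-specular combination; both are illustrative assumptions, not the claims' required formulas.

```python
# Hedged sketch of the pixel-color computation in claims 1 and 3.
import numpy as np

def composite_pixel(densities, diffuse, specular, deltas):
    """Density-weighted averaging along one ray:
    w_i = T_i * (1 - exp(-sigma_i * delta_i)),
    T_i = prod_{j<i} exp(-sigma_j * delta_j)."""
    alphas = 1.0 - np.exp(-densities * deltas)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = trans * alphas                         # per-sample weights
    avg_diffuse = (weights[:, None] * diffuse).sum(axis=0)
    avg_specular = (weights[:, None] * specular).sum(axis=0)
    return avg_diffuse + avg_specular                # assumed additive combination

# Three samples along a ray: empty space, then two increasingly dense samples.
densities = np.array([0.0, 5.0, 50.0])
deltas = np.full(3, 0.1)
diffuse = np.array([[1.0, 0.0, 0.0],   # red
                    [0.0, 1.0, 0.0],   # green
                    [0.0, 0.0, 1.0]])  # blue
specular = np.zeros((3, 3))
pixel = composite_pixel(densities, diffuse, specular, deltas)
```

The zero-density first sample contributes nothing to the pixel, and the dense third sample is attenuated by the transmittance through the second, mirroring how occlusion emerges from the density field.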
4. The computer-implemented method of any preceding claim, further comprising training the machine learning model based on a set of images of one or more objects that are captured from a plurality of viewpoints.
5. The computer-implemented method of claim 4, wherein the set of images includes a first set of images that include diffuse colors and specular information and a second set of images that include the diffuse colors.
6. The computer-implemented method of claim 4 or 5, wherein the machine learning model is further trained based on a generated set of images of the one or more objects from another plurality of viewpoints.
7. The computer-implemented method of any preceding claim, further comprising fitting at least one of the ID code or the machine learning model to one or more images of another object.
8. The computer-implemented method of claim 7, further comprising fitting the at least one of the ID code or the machine learning model to geometry associated with the another object.
9. The computer-implemented method of any preceding claim, wherein the object is a head.
10. One or more computer-readable storage media including instructions that, when executed by one or more processing units, cause the one or more processing units to perform steps for rendering an image of an object, the steps comprising: tracing a ray through a pixel into a virtual scene; sampling one or more positions along the ray; applying a machine learning model to the one or more positions and an identifier (ID) code associated with an object to determine, for each position included in the one or more positions, a density, a diffuse color, and a specular color; and computing a color of the pixel based on the density, the diffuse color, and the specular color corresponding to each position included in the one or more positions; wherein the machine learning model comprises an identity model that maps the ID code to (i) a deformation ID code that encodes a geometric deformation from a canonical object geometry, and (ii) a canonical ID code that encodes an appearance within a space associated with the canonical object geometry.
11. The one or more computer-readable storage media of claim 10, wherein the machine learning model comprises a neural radiance field (NeRF) model that comprises a multi-layer perceptron (MLP) trunk, a first MLP branch that computes densities, a second MLP branch that computes diffuse colors, and a third MLP branch that computes specular colors.
12. The one or more computer-readable storage media of any one of claims 10 or 11, wherein computing the color of the pixel comprises: averaging the diffuse color corresponding to each position included in the one or more positions based on the density corresponding to the position to determine an averaged diffuse color; averaging the specular color corresponding to each position included in the one or more positions based on the density corresponding to the position to determine an averaged specular color; and computing the color of the pixel based on the averaged diffuse color and the averaged specular color.
13. The one or more computer-readable storage media of any of claims 10 to 12, wherein the instructions, when executed by the one or more processing units, further cause the one or more processing units to perform the step of training the machine learning model based on a set of images of one or more objects that are captured from a plurality of viewpoints.
14. The one or more computer-readable storage media of claim 13, wherein the set of images includes a first set of images that include diffuse colors and specular information and a second set of images that include the diffuse colors.
15. The one or more computer-readable storage media of any of claims 10 to 14, wherein the instructions, when executed by the one or more processing units, further cause the one or more processing units to perform the step of fitting at least one of the ID code or the machine learning model to one or more images of another object.
16. The one or more computer-readable storage media of claim 15, wherein the instructions, when executed by the one or more processing units, further cause the one or more processing units to perform the step of fitting the at least one of the ID code or the machine learning model to geometry associated with the another object.
17. A computer-implemented method for training a machine learning model, the method comprising: receiving a first set of images of one or more objects that are captured from a plurality of viewpoints; generating a second set of images of the one or more objects from another plurality of viewpoints; and training, based on the first set of images and the second set of images, a machine learning model, wherein the machine learning model comprises a neural radiance field model and an identity model, and wherein the identity model maps an identifier (ID) code to (i) a deformation ID code that encodes a geometric deformation from a canonical object geometry, and (ii) a canonical ID code that encodes an appearance within a space associated with the canonical object geometry.
18. The method of claim 17, wherein the training is based on at least one of a rendering loss, a deformation loss, a density loss, or an ID loss.
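Claim 18 names four loss terms without defining them. A plausible weighted combination is sketched below; the individual term definitions (photometric L2 rendering loss, a penalty on deformation magnitude, a sparsity penalty on density, and a norm penalty keeping ID codes compact) and the weights are all assumptions for illustration, not the patent's definitions.

```python
# Hypothetical sketch of combining the four losses named in claim 18.
import numpy as np

def total_loss(rendered, target, deform_offsets, densities, id_code,
               w_render=1.0, w_deform=0.01, w_density=0.001, w_id=0.001):
    render_loss = np.mean((rendered - target) ** 2)    # photometric L2
    deform_loss = np.mean(deform_offsets ** 2)         # favor small deformations
    density_loss = np.mean(np.abs(densities))          # favor sparse density
    id_loss = np.sum(id_code ** 2)                     # keep ID codes compact
    return (w_render * render_loss + w_deform * deform_loss
            + w_density * density_loss + w_id * id_loss)

# A perfect reconstruction with no deformation and empty space costs nothing.
loss = total_loss(rendered=np.ones((4, 3)) * 0.5, target=np.ones((4, 3)) * 0.5,
                  deform_offsets=np.zeros((4, 3)), densities=np.zeros(4),
                  id_code=np.zeros(8))  # -> 0.0
```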
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202163280101P | 2021-11-16 | 2021-11-16 | |
| US17/983,246 US12236517B2 (en) | 2021-11-16 | 2022-11-08 | Techniques for multi-view neural object modeling |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| NZ794397A NZ794397A (en) | 2025-05-30 |
| NZ794397B2 (en) | 2025-09-02 |