CN114972632A - Image processing method and device based on neural radiance field - Google Patents

Image processing method and device based on neural radiance field

Info

Publication number
CN114972632A
Authority
CN
China
Prior art keywords
sampling
radiation field
sample
sampling point
target
Prior art date
Legal status
Pending
Application number
CN202210420957.6A
Other languages
Chinese (zh)
Inventor
高岱恒
张鹏
张邦
谭平
Current Assignee
Alibaba Damo Institute Hangzhou Technology Co Ltd
Original Assignee
Alibaba Damo Institute Hangzhou Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba Damo Institute Hangzhou Technology Co Ltd filed Critical Alibaba Damo Institute Hangzhou Technology Co Ltd
Priority to CN202210420957.6A
Publication of CN114972632A


Classifications

    • G06T 17/00 Three-dimensional [3D] modelling, e.g. data description of 3D objects
    • G06N 20/00 Machine learning
    • G06T 15/205 Image-based rendering (3D image rendering; perspective computation)
    • G06T 19/00 Manipulating 3D models or images for computer graphics
    • G06T 19/006 Mixed reality
    • G06T 19/20 Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
    • G06T 3/4053 Super resolution, i.e. output image resolution higher than sensor resolution
    • G06T 3/4076 Super resolution by iteratively correcting the provisional high-resolution image using the original low-resolution image
    • G06T 7/11 Region-based segmentation
    • G06T 7/90 Determination of colour characteristics
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G06T 2207/30201 Face

Abstract

The embodiment of the specification provides an image processing method and device based on a neural radiance field. The image processing method based on the neural radiance field comprises the following steps: determining sampling position information, sampling view angle information and a target object image in response to an image processing instruction; inputting the sampling position information, the sampling view angle information and the target object image into a pre-trained neural radiance field model for processing, and obtaining density information, a color characteristic value and a signed distance function value corresponding to each sampling point output by the neural radiance field model, wherein the neural radiance field model is a machine learning model; determining at least two sampling point subsets based on the signed distance function value corresponding to each sampling point; and rendering, according to the density information and the color characteristic values of the sampling points in each sampling point subset, a target sub-object corresponding to each sampling point subset, and generating the target object based on each target sub-object.

Description

Image processing method and device based on neural radiance field
Technical Field
The embodiment of the specification relates to the technical field of image processing, and in particular to an image processing method based on a neural radiance field. One or more embodiments of the present specification also relate to an image processing apparatus based on a neural radiance field, a computing device, a computer-readable storage medium, and a computer program.
Background
With the development of computer technology, image processing has advanced considerably in recent years, and synthesizing virtual objects at different viewing angles is an important branch of the field. Low-cost editing and synthesis of realistic virtual objects at different viewing angles is very important for 2D virtual object synthesis, while 3D virtual object synthesis edits the coding vectors corresponding to virtual objects based on a neural network. In the process of synthesizing 3D virtual objects, multi-view consistency and identity preservation must be ensured; that is, the virtual object must have the same structure at any viewing angle and must not change identity across different viewing angles. Taking a human face as an example, the generated face must have the same structure at any viewing angle and must remain the same person at different viewing angles. However, in the current process of generating 3D virtual objects by model rendering, view inconsistency often occurs, so resolving this inconsistency has become an urgent problem for technical staff.
Disclosure of Invention
In view of this, the embodiments of the present disclosure provide an image processing method based on a neural radiance field. One or more embodiments of the present disclosure also relate to a face generation method based on a neural radiance field, an image processing apparatus based on a neural radiance field, a face generation apparatus based on a neural radiance field, a computing device, a computer-readable storage medium, and a computer program, so as to solve technical defects in the prior art.
According to a first aspect of embodiments of the present specification, there is provided an image processing method based on a neural radiance field, including:
determining sampling position information, sampling view angle information and a target object image in response to an image processing instruction;
inputting the sampling position information, the sampling view angle information and the target object image into a pre-trained neural radiance field model for processing, and obtaining density information, a color characteristic value and a signed distance function value corresponding to each sampling point output by the neural radiance field model, wherein the neural radiance field model is a machine learning model;
determining at least two sampling point subsets based on the signed distance function value corresponding to each sampling point;
and rendering, according to the density information and the color characteristic values of the sampling points in each sampling point subset, a target sub-object corresponding to each sampling point subset, and generating the target object based on each target sub-object.
According to a second aspect of embodiments of the present specification, there is provided a face generation method based on a neural radiance field, including:
determining sampling position information, sampling view angle information and a face image in response to a face generation instruction;
inputting the sampling position information, the sampling view angle information and the face image into a pre-trained neural radiance field model for processing, and obtaining density information, a color characteristic value and a signed distance function value corresponding to each sampling point output by the neural radiance field model, wherein the neural radiance field model is a machine learning model;
determining at least two sampling point subsets based on the signed distance function value corresponding to each sampling point;
and rendering, based on each sampling point subset, a face subset corresponding to each sampling point subset, and generating a target face based on each face subset.
According to a third aspect of embodiments herein, there is provided an image processing apparatus based on a neural radiance field, including:
a determination module configured to determine sampling position information, sampling view angle information and a target object image in response to an image processing instruction;
an input module configured to input the sampling position information, the sampling view angle information and the target object image into a pre-trained neural radiance field model for processing, and obtain density information, a color characteristic value and a signed distance function value corresponding to each sampling point output by the neural radiance field model, wherein the neural radiance field model is a machine learning model;
a subset determination module configured to determine at least two sampling point subsets based on the signed distance function value corresponding to each sampling point;
and a rendering module configured to render a target sub-object corresponding to each sampling point subset according to the density information and the color characteristic values of the sampling points in the subset, and generate the target object based on each target sub-object.
According to a fourth aspect of embodiments herein, there is provided a face generation apparatus based on a neural radiance field, comprising:
an image determination module configured to determine sampling position information, sampling view angle information and a face image in response to a face generation instruction;
a model input module configured to input the sampling position information, the sampling view angle information and the face image into a pre-trained neural radiance field model for processing, and obtain density information, a color characteristic value and a signed distance function value corresponding to each sampling point output by the neural radiance field model, wherein the neural radiance field model is a machine learning model;
a subset determination module configured to determine at least two sampling point subsets based on the signed distance function value corresponding to each sampling point;
and a face rendering module configured to render a face subset corresponding to each sampling point subset based on each sampling point subset, and generate a target face based on each face subset.
According to a fifth aspect of embodiments herein, there is provided an augmented reality AR device or a virtual reality VR device, comprising:
a memory, a processor, and a display;
the memory is configured to store computer-executable instructions and the processor is configured to execute the computer-executable instructions, which when executed by the processor implement the steps of:
determining sampling position information, sampling view angle information and a target object image in response to an image processing instruction;
inputting the sampling position information, the sampling view angle information and the target object image into a pre-trained neural radiance field model for processing, and obtaining density information, a color characteristic value and a signed distance function value corresponding to each sampling point output by the neural radiance field model, wherein the neural radiance field model is a machine learning model;
determining at least two sampling point subsets based on the signed distance function value corresponding to each sampling point;
rendering according to the density information and the color characteristic value of the sampling points in each sampling point subset to generate a target sub-object corresponding to each sampling point subset, and generating the target object based on each target sub-object;
displaying the target object through a display of the augmented reality AR device or the virtual reality VR device.
According to a sixth aspect of embodiments herein, there is provided a computing device comprising a memory, a processor and computer instructions stored on the memory and executable on the processor, wherein the processor, when executing the computer instructions, implements the steps of the image processing method based on a neural radiance field or the face generation method based on a neural radiance field.
According to a seventh aspect of embodiments herein, there is provided a computer-readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the image processing method based on a neural radiance field or the face generation method based on a neural radiance field.
According to an eighth aspect of embodiments of the present specification, there is provided a computer program which, when executed in a computer, causes the computer to execute the steps of the image processing method based on a neural radiance field or the face generation method based on a neural radiance field described above.
The image processing method based on the neural radiance field provided by this specification determines, in response to an image processing instruction, sampling position information, sampling view angle information and a target object image; inputs the sampling position information, the sampling view angle information and the target object image into a pre-trained neural radiance field model for processing, and obtains density information, a color characteristic value and a signed distance function value corresponding to each sampling point output by the neural radiance field model, wherein the neural radiance field model is a machine learning model; determines at least two sampling point subsets based on the signed distance function value corresponding to each sampling point; and renders, according to the density information and the color characteristic values of the sampling points in each sampling point subset, a target sub-object corresponding to each sampling point subset, and generates the target object based on each target sub-object.
In an embodiment of the present specification, the signed distance function value is fused into the neural radiance field model (NeRF model): the density information, the color characteristic value and the signed distance function value (SDF value) of each sampling point are obtained from the sampling position information, the sampling view angle information and the target object image; each component of the virtual object is determined by the NeRF model equipped with SDF values; each component is rendered separately; and the rendered components are then spliced into the target virtual object. The density information and the color characteristic values preserve the geometric properties the neural network learns for the 3D object, while the SDF value determines the component to which each sampling point belongs, so that every component of the virtual object is generated and the quality and consistency of the virtual object rendering effect are guaranteed.
Drawings
Fig. 1 is a flowchart of an image processing method based on a neural radiance field according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of acquiring sampling position information and sampling view angle information provided by one embodiment of the present description;
fig. 3 is a schematic diagram illustrating SDF value analysis of a virtual face according to an embodiment of the present disclosure;
fig. 4a is a flowchart illustrating a processing procedure of a face generation method based on a neural radiance field according to an embodiment of the present disclosure;
FIG. 4b is a schematic diagram of a face subset provided in an embodiment of the present specification;
fig. 5 is a schematic structural diagram of an image processing apparatus based on a neural radiance field according to an embodiment of the present disclosure;
fig. 6 is a schematic structural diagram of a face generation device based on a neural radiance field according to an embodiment of the present specification;
fig. 7 is a block diagram of a computing device according to an embodiment of the present disclosure.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present description. However, this description may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; those skilled in the art will be able to make and use variations of the present disclosure without departing from its spirit and scope.
The terminology used in the description of the one or more embodiments is for the purpose of describing the particular embodiments only and is not intended to be limiting of the description of the one or more embodiments. As used in one or more embodiments of the present specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification is intended to encompass any and all possible combinations of one or more of the associated listed items.
It will be understood that, although the terms first, second, etc. may be used herein in one or more embodiments to describe various information, this information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, without departing from the scope of one or more embodiments of the present description, a first may also be referred to as a second, and similarly, a second may also be referred to as a first. The word "if" as used herein may be interpreted as "when", "upon" or "in response to determining", depending on the context.
First, the noun terms referred to in one or more embodiments of the present specification are explained.
Image synthesis: image Synthesis, an approach of editing a coding vector corresponding to a human face based on a neural network.
Neural radiation Field (Neural Radiance Field): a unique structure of volume rendering (3D rendering).
Face Semantic Segmentation (Semantic Segmentation): the human face is separated according to the hair, the skin, the background and the like.
Multi-View consistency (Multi-View consistency): the human face is guaranteed to be of the same structure under any view angle, namely the identity-preserving property (ID-preserving) is excellent.
Identity maintenance (ID-Preservation): the ability of a person to not change at different viewing angles.
In the present specification, an image processing method based on a neural radiance field is provided. One or more embodiments of the present specification also relate to a face generation method based on a neural radiance field, an image processing apparatus based on a neural radiance field, a face generation apparatus based on a neural radiance field, a computing device, a computer-readable storage medium, and a computer program, which are described in detail in the following embodiments one by one.
Fig. 1 shows a flowchart of an image processing method based on a neural radiance field according to an embodiment of the present disclosure, which includes steps 102 to 108.
Step 102: the sampling position information, the sampling view angle information, and the target object image are determined in response to the image processing instruction.
The image processing method based on the neural radiance field adopts a neural radiance field model, which preserves the geometric properties the neural network learns for the 3D object and guarantees the quality and consistency of editing and synthesizing the virtual object.
The only input the neural radiance field model needs is pictures taken at different viewing angles; it requires no camera parameters, no ambient light, and none of the many ingredients of a 3D-mesh-based rendering pipeline: texture, lighting, material, camera position, and so on.
The sampling position information and the sampling view angle information are the input parameters of the neural radiance field model. A neural radiance field (NeRF) is a model for rendering and generating a three-dimensional scene. A conventional NeRF is an implicit scene representation from which the three-dimensional model cannot be viewed directly, which makes it somewhat inconvenient in practical applications. The image processing method based on the neural radiance field provided by this specification adopts a neural radiance field model to preserve the geometric properties of the neural network for the 3D object while editing the virtual object through explicit synthesis.
NeRF is an abbreviation of Neural Radiance Fields, where a radiance field refers to a function, or mapping, g_θ. Specifically, NeRF can be expressed by the following formula 1:

(σ, c) = g_θ(x, d)    (formula 1)

The inputs of the mapping are x and d, where x ∈ ℝ³ is the coordinate of a three-dimensional space point, namely the sampling position information, and d ∈ S² is the viewing direction, namely the sampling view angle information. The outputs of the mapping are σ and c, where σ ∈ ℝ⁺ is the density information and c ∈ ℝ³ is an RGB color.
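To make the mapping concrete, the following is a minimal sketch in plain Python/NumPy of a NeRF-style network realizing (σ, c) = g_θ(x, d). The positional-encoding depth, layer sizes and random weights are illustrative assumptions, not values taken from this patent.

```python
import numpy as np

def positional_encoding(v, num_freqs=6):
    """Map each coordinate to sin/cos features at increasing frequencies."""
    freqs = 2.0 ** np.arange(num_freqs)          # 1, 2, 4, ...
    angles = np.outer(freqs, v).ravel()          # num_freqs * len(v) values
    return np.concatenate([np.sin(angles), np.cos(angles)])

def g_theta(x, d, params):
    """Toy MLP: position x in R^3 and unit view direction d -> (sigma, c)."""
    h = positional_encoding(x)                   # 36 features for 3 coords
    for W, b in params["trunk"]:
        h = np.maximum(W @ h + b, 0.0)           # ReLU hidden layers
    sigma = np.log1p(np.exp(params["w_sigma"] @ h))        # softplus -> sigma >= 0
    h_view = np.concatenate([h, positional_encoding(d)])   # add view features
    c = 1.0 / (1.0 + np.exp(-(params["W_rgb"] @ h_view)))  # sigmoid -> RGB
    return sigma, c

rng = np.random.default_rng(0)
params = {
    "trunk": [(0.1 * rng.normal(size=(64, 36)), np.zeros(64))],
    "w_sigma": 0.1 * rng.normal(size=64),
    "W_rgb": 0.1 * rng.normal(size=(3, 100)),    # 100 = 64 hidden + 36 view
}
x = np.array([0.1, 0.2, 0.3])                    # sampling position
d = np.array([0.0, 0.0, -1.0])                   # sampling view direction
sigma, c = g_theta(x, d, params)                 # density and RGB color
```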
Optionally, determining the sampling position information and the sampling view angle information includes:
determining sampling position information on a preset hemispherical surface;
and determining corresponding sampling view angle information based on the sampling position information.
Referring to fig. 2, which shows a schematic diagram of acquiring sampling position information and sampling view angle information provided in an embodiment of the present specification, the preset hemispherical surface refers to the virtual hemisphere used in NeRF to cover the target object image. A piece of sampling position information, that is, x in formula 1 above, is randomly determined on the virtual hemisphere, and a line-of-sight direction, that is, d in the formula, is then constructed based on the sampling position information, as illustrated in the sketch below.
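As an illustration of this sampling step, the sketch below draws a random camera position x on the upper hemisphere and builds the line-of-sight direction d as the unit vector from x toward the hemisphere center. Pointing d at the center is an assumption, since the patent only states that d is constructed from x.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_position_and_view(radius=1.0, center=np.zeros(3)):
    """Pick x on the upper hemisphere (uniform in angles, for simplicity)
    and aim the view direction d at the hemisphere center."""
    theta = rng.uniform(0.0, 2.0 * np.pi)        # azimuth
    phi = rng.uniform(0.0, 0.5 * np.pi)          # polar angle, upper half only
    x = center + radius * np.array([
        np.sin(phi) * np.cos(theta),
        np.sin(phi) * np.sin(theta),
        np.cos(phi),
    ])
    d = (center - x) / np.linalg.norm(center - x)  # unit view direction
    return x, d

x, d = sample_position_and_view()
```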
The target object image specifically refers to an image of a three-dimensional object that is desired to be rendered in the embodiment of the present specification, and in the present specification, the target object image is a picture of the same scene taken from different positions.
In a specific embodiment provided in the present specification, taking the target object image as an image of one vase as an example, the sampling position information x, the sampling view angle information d, and the target object image P are determined in response to an image processing instruction.
Step 104: and inputting the sampling position information, the sampling visual angle information and the target object image into a pre-trained nerve radiation field model for processing, and obtaining density information, a color characteristic value and a symbol distance function value corresponding to each sampling point output by the nerve radiation field model.
Wherein the nerve radiation field model is a machine learning model. In the method provided by this specification, the nerve radiation field model is different from a conventional nerve radiation field model, and on the basis that the conventional nerve radiation field model outputs sampling point density information σ and a color characteristic value c, a symbol distance function value (SDF value), a symbol distance function (sign distance function), abbreviated as SDF, may also be referred to as an oriented distance function (oriented distance function), determines a distance from a point to a region boundary on a limited region in space, and defines a sign of the distance at the same time: the point is positive inside the region boundary, negative outside, and 0 when located on the boundary. According to the SDF value corresponding to each sampling point, it can be determined to which part of the virtual object each sampling point belongs. Referring to fig. 3, fig. 3 is a schematic diagram illustrating an SDF value analysis of a virtual face according to an embodiment of the present disclosure, as shown in fig. 3, a color of an upper left corner region is a face region, a color of a lower left corner region is a hair region, a color of a lower right corner region is a background region, and a color of an upper right corner region is undefined.
In the method provided by the present specification, the nerve radiation field model outputs an SDF value in addition to the density information and the color characteristic value information of the conventional nerve radiation field model, and the area corresponding to each sampling point can be determined based on the SDF value. The method is convenient for rendering the virtual object after splitting the virtual object according to the SDF value of each sampling point, and then merging the virtual object.
Specifically, inputting the sampling position information, the sampling view angle information and the target object image into a pre-trained neural radiance field model for processing includes:
inputting the sampling position information, the sampling view angle information and the target object image into the pre-trained neural radiance field model;
the neural radiance field model emits sampling point rays toward sampling points of the target object image based on the sampling position information and the sampling view angle information;
and determining the density information, the color characteristic value and the signed distance function value of each sampling point based on the sampling point ray corresponding to each sampling point.
In practical application, the NeRF provided in this specification uses a ray casting technique to emit sampling point rays from the sampling position toward the sampling points of the target object image along the sampling view angle, where the sampling points may be pixel points in the target object image.
Ray casting is a relatively simple implementation of volume rendering, a rendering method that projects volume data (a 3D texture) onto a 2D plane. A ray in ray casting starts from the camera and is directed at the 3D volume texture, but this only fixes the direction of the ray; in practice, the starting position of the ray is its front intersection with the volume texture, and the ending position is its back intersection with the volume texture.
In the process of obtaining the density information, color characteristic value and SDF value of each sampling point, a ray marching technique is used in addition to ray casting: a ray-marching ray advances from its starting point, stops at regular intervals for computation, and then continues, where the interval may be fixed or variable.
Through the ray casting and ray marching techniques of the NeRF model, the density information, color characteristic value and SDF value of each sampling point can be obtained, as sketched below.
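The following sketch combines the two techniques for one ray: ray casting fixes the segment between the front and back intersections with the volume, and ray marching places sample points at fixed intervals along it. The near/far bounds and step count are illustrative.

```python
import numpy as np

def march_ray(origin, direction, t_near, t_far, num_steps=64):
    """Sample points at regular intervals between the front (t_near) and
    back (t_far) intersections of the ray with the volume texture."""
    step = (t_far - t_near) / num_steps
    t_values = t_near + (np.arange(num_steps) + 0.5) * step  # interval midpoints
    points = origin[None, :] + t_values[:, None] * direction[None, :]
    return points, t_values, step

origin = np.array([0.0, 0.0, 2.0])      # camera position
direction = np.array([0.0, 0.0, -1.0])  # ray direction toward the volume
points, t_values, step = march_ray(origin, direction, t_near=1.0, t_far=3.0)
```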
In a specific embodiment provided in this specification, taking the target object image as an image of a vase as an example, the sampling position information x, the sampling view angle information d and the target object image P are input into the pre-trained neural radiance field model, and the density information σ, the color characteristic value c and the signed distance function value SDF corresponding to each sampling point output by the neural radiance field model are obtained.
Step 106: at least two subsets of sample points are determined based on the symbol distance function value corresponding to each sample point.
In the above steps, the SDF value of each sampling point is obtained, that is, which sampling point subset each sampling point belongs to can be determined according to the SDF value corresponding to each sampling point, specifically, a virtual object can be composed of a plurality of parts, for example, a photo of a person's head can be divided into a face, a hair, and a background; one vase can be divided into a vase body, a flower, a background and the like. From the SDF value of each sample point it can be determined to which part each sample point belongs.
Specifically, determining at least two sampling point subsets based on the signed distance function value corresponding to each sampling point includes:
determining a preset signed distance function value interval corresponding to each sampling point subset;
and determining the sampling point subset corresponding to each sampling point based on the signed distance function value corresponding to the sampling point and each preset signed distance function value interval.
The preset signed distance function value interval refers to the interval of SDF values taken by the sampling points of each component of the virtual object; the preset interval corresponding to each component is determined from the SDF values of the sampling points in that component.
Which component of the virtual object each sampling point belongs to is determined from its SDF value, as shown in fig. 3: a sampling point whose SDF value falls in the upper left corner region belongs to the face part, one whose SDF value falls in the lower left corner region belongs to the hair part, and one whose SDF value falls in the lower right corner region belongs to the background part. That is, the image shown in fig. 3 contains three sampling point subsets: a face subset, a hair subset and a background subset. A minimal sketch of this assignment follows.
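The sketch below assigns each sampling point to the subset whose preset signed distance function value interval contains its SDF value. The interval boundaries are made-up placeholders; in the patent they are determined per component from the SDF values of that component's sampling points.

```python
import numpy as np

PRESET_SDF_INTERVALS = {        # hypothetical (low, high) interval per component
    "face":       (0.00, 0.35),
    "hair":       (0.35, 0.70),
    "background": (0.70, 1.00),
}

def split_into_subsets(sdf_values):
    """Assign each sampling point index to the subset whose preset SDF
    interval contains its SDF value."""
    subsets = {name: [] for name in PRESET_SDF_INTERVALS}
    for idx, s in enumerate(sdf_values):
        for name, (low, high) in PRESET_SDF_INTERVALS.items():
            if low <= s < high:
                subsets[name].append(idx)   # each point joins one subset
                break
    return subsets

subsets = split_into_subsets(np.array([0.1, 0.5, 0.9, 0.2]))
# {'face': [0, 3], 'hair': [1], 'background': [2]}
```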
Step 108: and rendering according to the density information and the color characteristic value of the sampling points in each sampling point subset to generate a target sub-object corresponding to each sampling point subset, and generating the target object based on each target sub-object.
After the sampling point subsets are divided, volume rendering (Volumetric rendering) can be performed on each sampling point subset according to the density information and the color characteristic value of the sampling points in each sampling point subset, so as to obtain the component of the virtual object corresponding to each sampling point subset.
Still taking fig. 3 as an example, volume rendering is performed on the sampling points in the face subset according to the density information and the color feature values, so as to obtain a face part; performing volume rendering on the sampling points in the hair subset according to the density information and the color characteristic value to obtain a hair part; and volume rendering is carried out on the sampling points in the background subset according to the density information and the color characteristic value, so that a background part can be obtained.
After the plurality of target sub-objects are obtained, they are synthesized to obtain the target object.
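A sketch of rendering one subset along a single ray is shown below, using the standard NeRF volume-rendering quadrature (weights w_i = T_i · (1 − exp(−σ_i δ_i)) with transmittance T_i). Masking out the densities of points outside the subset is one plausible way to isolate a component; the patent does not spell out this detail.

```python
import numpy as np

def render_subset(sigmas, colors, delta, in_subset):
    """Composite the color of one component along a ray from the densities
    and colors of the sampling points that belong to the subset."""
    sigmas = np.where(in_subset, sigmas, 0.0)      # keep only this component
    alphas = 1.0 - np.exp(-sigmas * delta)         # per-sample opacity
    trans = np.cumprod(np.concatenate(([1.0], 1.0 - alphas[:-1])))
    weights = trans * alphas                       # w_i = T_i * alpha_i
    return weights @ colors                        # (N,) @ (N, 3) -> RGB

sigmas = np.array([0.5, 2.0, 1.0])
colors = np.array([[0.9, 0.2, 0.2], [0.2, 0.9, 0.2], [0.2, 0.2, 0.9]])
face_rgb = render_subset(sigmas, colors, delta=0.03,
                         in_subset=np.array([True, True, False]))
```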
In practical application, rendering a target sub-object corresponding to each sampling point subset according to the density information and the color characteristic values of the sampling points in the subset comprises the following steps:
selecting a target sampling point subset;
and rendering and generating a target sub-object corresponding to the target sampling point subset based on the density information and the color characteristic value of each sampling point in the target sampling point subset.
In practical application, take one sampling point subset as an example: a target sampling point subset is determined among the plurality of sampling point subsets, and the target sub-object corresponding to it is rendered from the density information and color characteristic values of the sampling points in that subset. Performing volume rendering separately in this way makes the components of the virtual object explicit, so that when virtual objects are subsequently combined, the target sub-objects can be recombined freely. This multiplies the number of possible combinations and enriches the styles of the rendered virtual objects. For example, when rendering faces, the face A1 and hairstyle A2 of person A and the face B1 and hairstyle B2 of person B can each be rendered; combining face A1 with hairstyle B2 generates person C, and combining face B1 with hairstyle A2 generates person D.
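A sketch of such a recombination is given below: rendered component images are merged with per-pixel masks, which are assumed here to come from the SDF-based subset assignment.

```python
import numpy as np

def combine_components(face_rgb, hair_rgb, background_rgb, face_mask, hair_mask):
    """Overlay a face render and a hair render onto a background render."""
    out = background_rgb.copy()
    out[hair_mask] = hair_rgb[hair_mask]   # e.g. hairstyle B2 from person B
    out[face_mask] = face_rgb[face_mask]   # e.g. face A1 from person A
    return out                             # -> new person C

h, w = 64, 64
face_a1, hair_b2, bg = (np.zeros((h, w, 3)) for _ in range(3))
face_mask = np.zeros((h, w), dtype=bool); face_mask[20:44, 16:48] = True
hair_mask = np.zeros((h, w), dtype=bool); hair_mask[4:20, 12:52] = True
person_c = combine_components(face_a1, hair_b2, bg, face_mask, hair_mask)
```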
In one embodiment provided in this specification, generating a target object based on each target sub-object includes:
generating an initial target object based on each target sub-object;
and improving the resolution of the initial target object to obtain the target object.
In practical application, after each target sub-object is obtained, the target sub-objects can be spliced to generate an initial target object. The resolution of this initial target object is generally low and gives the user a poor viewing experience, so its resolution needs to be improved. Specifically, improving the resolution of the initial target object to obtain the target object includes:
inputting the initial target object to a resolution enhancement model;
and obtaining a target object corresponding to the initial target object output by the resolution improvement model.
In practical applications, the initial target object may be input into a StyleGAN-based encoder (the resolution enhancement model) for resolution enhancement, obtaining the high-resolution target object output by the resolution enhancement model. For example, if the resolution of the initial target object is 64 × 64, a target object with a resolution of 512 × 512 can be obtained after processing by the resolution enhancement model.
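The sketch below only illustrates the interface of this step: `enhance` is a stand-in that uses naive nearest-neighbor upsampling in place of the learned StyleGAN-based model, whose architecture the patent does not detail.

```python
import numpy as np

def enhance(initial_rgb, factor=8):
    """Placeholder for the resolution enhancement model:
    (H, W, 3) -> (H*factor, W*factor, 3). A learned model would replace this."""
    return initial_rgb.repeat(factor, axis=0).repeat(factor, axis=1)

initial_object = np.zeros((64, 64, 3))   # low-resolution initial target object
target_object = enhance(initial_object)  # (512, 512, 3), as in the example above
```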
In one embodiment provided herein, the neural radiance field model is trained by the following steps:
acquiring a sample sampling position, sample sampling view angle information and a sample object image;
inputting the sample sampling position, the sample sampling view angle information and the sample object image into a neural radiance field model, and obtaining density information, a color characteristic value and a signed distance function value corresponding to each sample sampling point output by the neural radiance field model;
inputting the sample sampling position, the sample sampling view angle information and the sample object image into an object generation model, and obtaining a sample object output by the object generation model;
determining at least two sample sampling point subsets according to the density information, the color characteristic value and the signed distance function value corresponding to each sample sampling point, rendering a prediction sub-object corresponding to each sample sampling point subset, and generating a prediction object based on each prediction sub-object;
performing image segmentation on the sample object to obtain the sample sub-objects corresponding to the sample object;
calculating a first loss value according to the prediction sub-objects and the sample sub-objects, and calculating a second loss value according to the prediction object and the sample object;
and adjusting model parameters of the neural radiance field model based on the first loss value and the second loss value, and continuing training until a model training stop condition is reached.
In practical applications, the neural radiance field model provided by the embodiment of this specification differs from a conventional neural radiance field model in that it outputs a signed distance function value in addition to the density information and the color characteristic value.
In the model training process, a sample sampling position, sample sampling view angle information and a sample object image are input into the neural radiance field model to be trained, and the density information, color characteristic value and signed distance function value corresponding to each sample sampling point output by the model are obtained. At least two sample sampling point subsets are determined according to the density information, the color characteristic value and the signed distance function value corresponding to each sample sampling point; a prediction sub-object corresponding to each sample sampling point subset is rendered, and a prediction object is generated based on the prediction sub-objects.
The sample sampling position, the sample sampling view angle information and the sample object image are also input into an object generation model, which is a conventional NeRF model, and the sample object output by the object generation model is obtained. After the sample object is obtained, image segmentation is performed on it to obtain the several sample sub-objects corresponding to the sample object.
A first loss value is calculated from the prediction sub-objects and the sample sub-objects, a second loss value is calculated from the prediction object and the sample object, and the model parameters of the neural radiance field model to be trained are trained jointly according to the first loss value and the second loss value until a model training stop condition is reached. The stop condition may be that both the first loss value and the second loss value fall below a preset threshold, or that the training reaches a preset number of training rounds.
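A minimal sketch of this joint objective follows, assuming simple mean-squared errors and equal weighting of the two terms; the patent specifies neither the loss form nor the weights.

```python
import numpy as np

def mse(a, b):
    return float(np.mean((np.asarray(a) - np.asarray(b)) ** 2))

def joint_loss(pred_subs, sample_subs, pred_obj, sample_obj):
    # first loss: per-component error between prediction and sample sub-objects
    loss1 = sum(mse(p, s) for p, s in zip(pred_subs, sample_subs))
    # second loss: error between the assembled prediction object and sample object
    loss2 = mse(pred_obj, sample_obj)
    return loss1 + loss2        # used jointly to adjust the model parameters
```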
In practical applications, at the initial stage of training the neural radiance field model to be trained, the difference between the generated prediction object and the real object may be large. For example, taking face image generation as an example, a face image generated by the model at the initial stage of training may differ considerably from a real face and needs to be corrected. Based on this, the method further includes:
acquiring a reference object;
performing image segmentation on the reference object to obtain the reference sub-objects corresponding to the reference object;
calculating a third loss value from the reference sub-objects and the prediction sub-objects;
and adjusting the model parameters of the neural radiance field model according to the third loss value.
The reference object is a real object: it is split by image segmentation into several reference sub-objects, the third loss value is calculated from the reference sub-objects and the prediction sub-objects, and the model parameters of the neural radiance field model to be trained are adjusted in combination with this third loss value.
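Under the same assumptions, this correction term can be sketched as one more mean-squared error between the reference sub-objects (segmented from the real reference object) and the prediction sub-objects, reusing `mse` from the previous sketch.

```python
def third_loss(reference_subs, pred_subs):
    """Correction loss against sub-objects segmented from a real reference."""
    return sum(mse(r, p) for r, p in zip(reference_subs, pred_subs))
```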
The image processing method based on the neural radiance field provided by this specification determines, in response to an image processing instruction, sampling position information, sampling view angle information and a target object image; inputs the sampling position information, the sampling view angle information and the target object image into a pre-trained neural radiance field model for processing, and obtains density information, a color characteristic value and a signed distance function value corresponding to each sampling point output by the neural radiance field model, wherein the neural radiance field model is a machine learning model; determines at least two sampling point subsets based on the signed distance function value corresponding to each sampling point; and renders, according to the density information and the color characteristic values of the sampling points in each sampling point subset, a target sub-object corresponding to each sampling point subset, and generates the target object based on each target sub-object.
In an embodiment of the present description, the signed distance function value is fused into the neural radiance field model (NeRF model): the density information, the color characteristic value and the signed distance function value (SDF value) of each sampling point are obtained from the sampling position information, the sampling view angle information and the target object image; each component of the virtual object is determined by the NeRF model equipped with SDF values; each component is rendered separately; and the rendered components are then spliced into the target virtual object. The density information and the color characteristic values preserve the geometric properties of the neural network for the 3D object, the SDF value determines the component corresponding to each sampling point so that every component of the virtual object is generated, and the quality and consistency of the virtual object rendering effect are guaranteed.
Secondly, the resolution of the target object can be improved through a StyleGAN-based resolution enhancement model.
The following further describes the image processing method based on the neural radiance field, taking as an example its application to rendering a human face, with reference to fig. 4a. Fig. 4a shows a processing flowchart of a face generation method based on a neural radiance field according to an embodiment of the present specification, with specific steps including steps 402 to 408.
Step 402: and determining sampling position information, sampling visual angle information and a human face image in response to the human face generation instruction.
Step 404: and inputting the sampling position information, the sampling visual angle information and the face image into a pre-trained nerve radiation field model for processing, and obtaining density information, a color characteristic value and a symbol distance function value corresponding to each sampling point output by the nerve radiation field model.
Wherein the nerve radiation field model is a machine learning model.
Step 406: at least two subsets of sample points are determined based on the symbol distance function value corresponding to each sample point.
Step 408: and rendering and generating a face subset corresponding to each sampling point subset based on each sampling point subset, and generating a target face based on each face subset.
Referring to fig. 4b, fig. 4b is a schematic diagram of a face subset provided in an embodiment of the present disclosure, as shown in fig. 4b, a left portion is a background set, a middle portion is a face set, and a right portion is a hair set.
Optionally, determining the sampling position information and the sampling view angle information includes:
determining sampling position information on a preset hemispherical surface;
and determining corresponding sampling view angle information based on the sampling position information.
Optionally, inputting the sampling position information, the sampling view angle information and the face image into a pre-trained neural radiance field model for processing includes:
inputting the sampling position information, the sampling view angle information and the face image into the pre-trained neural radiance field model;
the neural radiance field model emits sampling point rays toward sampling points of the face image based on the sampling position information and the sampling view angle information;
and determining the density information, the color characteristic value and the signed distance function value of each sampling point based on the sampling point ray corresponding to each sampling point.
Optionally, determining at least two sampling point subsets based on the signed distance function value corresponding to each sampling point includes:
determining a preset signed distance function value interval corresponding to each sampling point subset;
and determining the sampling point subset corresponding to each sampling point based on the signed distance function value corresponding to the sampling point and each preset signed distance function value interval.
Optionally, rendering and generating a face subset corresponding to each sampling point subset based on each sampling point subset includes:
selecting a target sampling point subset;
and rendering and generating a target face sub-object corresponding to the target sampling point subset based on the density information and the color characteristic value of each sampling point in the target sampling point subset.
Optionally, generating the target face based on each face subset includes:
generating an initial face based on each target face sub-object;
and improving the resolution of the initial face to obtain a target face.
Optionally, improving the resolution of the initial face to obtain the target face includes:
inputting the initial face to a resolution improvement model;
and obtaining a target face corresponding to the initial face output by the resolution improvement model.
Optionally, the neural radiance field model is trained by the following steps:
acquiring a sample sampling position, sample sampling view angle information and a sample face image;
inputting the sample sampling position, the sample sampling view angle information and the sample face image into a neural radiance field model, and obtaining density information, a color characteristic value and a signed distance function value corresponding to each sample sampling point output by the neural radiance field model;
inputting the sample sampling position, the sample sampling view angle information and the sample face image into a face generation model to obtain a sample face output by the face generation model;
determining at least two sample sampling point subsets according to the density information, the color characteristic value and the signed distance function value corresponding to each sample sampling point, rendering a predicted face sub-object corresponding to each sample sampling point subset, and generating a predicted face based on each predicted face sub-object;
performing image segmentation on the sample face to obtain the sample face sub-objects corresponding to the sample face;
calculating a first loss value according to the predicted face sub-objects and the sample face sub-objects, and calculating a second loss value according to the predicted face and the sample face;
and adjusting model parameters of the neural radiance field model based on the first loss value and the second loss value, and continuing training until a model training stop condition is reached.
Optionally, after generating the predicted face sub-object corresponding to each sample sampling point subset through rendering, the method further includes:
acquiring a reference face;
performing image segmentation on the reference face to obtain a reference face sub-object corresponding to the reference face;
calculating a third loss value according to the reference face sub-object and the predicted face sub-object;
and adjusting the model parameters of the neural radiance field model according to the third loss value.
The face generation method based on the neural radiance field provided by this specification fuses the signed distance function value into the neural radiance field model (NeRF model): the density information, the color characteristic value and the signed distance function value (SDF value) of each sampling point are obtained from the sampling position information, the sampling view angle information and the face image; the components of the face are determined by the NeRF model equipped with SDF values; the components are rendered separately; and the rendered components are spliced into the face image. The density information and the color characteristic values preserve the geometric properties of the neural network for the 3D object, the SDF value determines the component corresponding to each sampling point so that every component of the face image is generated, and the quality and consistency of the rendering effect are guaranteed.
Secondly, the resolution of the target face can be improved through a StyleGAN-based resolution enhancement model.
Corresponding to the above-mentioned image processing method based on the neural radiance field, this specification further provides an embodiment of an image processing apparatus based on the neural radiance field; fig. 5 shows a schematic structural diagram of an image processing apparatus based on a neural radiance field provided in an embodiment of this specification. As shown in fig. 5, the apparatus includes:
a determination module 502 configured to determine sampling position information, sampling view angle information and a target object image in response to an image processing instruction;
an input module 504 configured to input the sampling position information, the sampling view angle information and the target object image into a pre-trained neural radiance field model for processing, and obtain density information, a color characteristic value and a signed distance function value corresponding to each sampling point output by the neural radiance field model, wherein the neural radiance field model is a machine learning model;
a subset determination module 506 configured to determine at least two sampling point subsets based on the signed distance function value corresponding to each sampling point;
and a rendering module 508 configured to render a target sub-object corresponding to each sampling point subset according to the density information and the color characteristic values of the sampling points in the subset, and generate the target object based on each target sub-object.
Optionally, the determining module 502 is further configured to:
determining sampling position information on a preset hemispherical surface;
and determining corresponding sampling view angle information based on the sampling position information.
Optionally, the input module 504 is further configured to:
inputting the sampling position information, the sampling view angle information and the target object image into the pre-trained neural radiance field model;
the neural radiance field model emits sampling point rays toward sampling points of the target object image based on the sampling position information and the sampling view angle information;
and determining the density information, the color characteristic value and the signed distance function value of each sampling point based on the sampling point ray corresponding to each sampling point.
Optionally, the subset determining module 506 is further configured to:
determining a preset signed distance function value interval corresponding to each sampling point subset;
and determining the sampling point subset corresponding to each sampling point based on the signed distance function value corresponding to the sampling point and each preset signed distance function value interval.
Optionally, the rendering module 508 is further configured to:
selecting a target sampling point subset;
and rendering and generating a target sub-object corresponding to the target sampling point subset based on the density information and the color characteristic value of each sampling point in the target sampling point subset.
Optionally, the rendering module 508 is further configured to:
generating an initial target object based on each target sub-object;
and improving the resolution of the initial target object to obtain the target object.
Optionally, the rendering module 508 is further configured to:
inputting the initial target object to a resolution enhancement model;
and obtaining a target object corresponding to the initial target object output by the resolution improvement model.
Optionally, the apparatus further comprises a training module configured to:
acquiring a sample sampling position, sample sampling view angle information and a sample object image;
inputting the sample sampling position, the sample sampling view angle information and the sample object image into a neural radiance field model, and obtaining density information, a color characteristic value and a signed distance function value corresponding to each sample sampling point output by the neural radiance field model;
inputting the sample sampling position, the sample sampling view angle information and the sample object image into an object generation model, and obtaining a sample object output by the object generation model;
determining at least two sample sampling point subsets according to the density information, the color characteristic value and the signed distance function value corresponding to each sample sampling point, rendering a prediction sub-object corresponding to each sample sampling point subset, and generating a prediction object based on each prediction sub-object;
performing image segmentation on the sample object to obtain the sample sub-objects corresponding to the sample object;
calculating a first loss value according to the prediction sub-objects and the sample sub-objects, and calculating a second loss value according to the prediction object and the sample object;
and adjusting model parameters of the neural radiance field model based on the first loss value and the second loss value, and continuing training until a model training stop condition is reached.
Optionally, the training module is further configured to:
acquiring a reference object;
performing image segmentation on the reference object to obtain a reference sub-object corresponding to the reference object;
calculating a third loss value according to the reference sub-object and the prediction sub-object;
and adjusting the model parameters of the neural radiance field model according to the third loss value.
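The reference-object supervision follows the same pattern (again with MSE standing in for whatever distance the patent actually uses):

```python
import torch
import torch.nn.functional as F

pred_subs      = [torch.rand(3, 32, 32, requires_grad=True) for _ in range(2)]
reference_subs = [torch.rand(3, 32, 32) for _ in range(2)]  # from segmenting the reference object

third_loss = sum(F.mse_loss(p, r) for p, r in zip(pred_subs, reference_subs))
third_loss.backward()  # further adjusts the radiance field model's parameters
```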
The image processing apparatus based on the neural radiance field provided by the present specification determines sampling position information, sampling view angle information and a target object image in response to an image processing instruction; inputs the sampling position information, the sampling view angle information and the target object image into a pre-trained neural radiance field model for processing, and obtains the density information, the color characteristic value and the signed distance function value corresponding to each sampling point output by the neural radiance field model, wherein the neural radiance field model is a machine learning model; determines at least two sampling point subsets based on the signed distance function value corresponding to each sampling point; renders, according to the density information and the color characteristic value of the sampling points in each sampling point subset, a target sub-object corresponding to each sampling point subset; and generates the target object based on each target sub-object.
In the embodiment of the present specification, by fusing the signed distance function (SDF) value into the neural radiance field (NeRF) model, the density information, the color characteristic value and the SDF value of each sampling point are obtained from the sampling position information, the sampling view angle information and the target object image; each component of the virtual object is determined by the SDF-value-fused NeRF model, each component is rendered separately, and the rendered components are then assembled into the target virtual object. In this way, the density information and the color characteristic values preserve the geometric properties the neural network learns for the 3D object, the SDF value determines the component to which each sampling point belongs so that each component of the virtual object can be generated, and the quality and consistency of the rendering effect of the virtual object are guaranteed.
In addition, the resolution of the target object can be improved through a StyleGAN-based resolution enhancement model.
The foregoing is a schematic description of the image processing apparatus based on the neural radiance field according to this embodiment. It should be noted that the technical solution of the image processing apparatus based on the neural radiance field and the technical solution of the image processing method based on the neural radiance field belong to the same concept; for details of the apparatus not described here, refer to the description of the method.
Corresponding to the above embodiment of the face generation method based on the neural radiance field, the present specification further provides an embodiment of a face generation apparatus based on the neural radiance field, and fig. 6 shows a schematic structural diagram of the face generation apparatus provided in an embodiment of the present specification. As shown in fig. 6, the apparatus includes:
an image determination module 602 configured to determine sampling position information, sampling view angle information and a face image in response to a face generation instruction;
a model input module 604 configured to input the sampling position information, the sampling view angle information and the face image into a pre-trained neural radiance field model for processing, and obtain the density information, the color characteristic value and the signed distance function value corresponding to each sampling point output by the neural radiance field model, wherein the neural radiance field model is a machine learning model;
a subset determination module 606 configured to determine at least two sampling point subsets based on the signed distance function value corresponding to each sampling point;
a face rendering module 608 configured to render, based on each sampling point subset, a face sub-object corresponding to each sampling point subset, and generate a target face based on each face sub-object.
Optionally, the image determination module 602 is further configured to:
determining sampling position information on a preset hemispherical surface;
and determining corresponding sampling view angle information based on the sampling position information.
Optionally, the model input module 604 is further configured to:
inputting the sampling position information, the sampling view angle information and the face image into a pre-trained neural radiance field model;
emitting, by the neural radiance field model, sampling point rays toward sampling points of the face image based on the sampling position information and the sampling view angle information;
and determining the density information, the color characteristic value and the signed distance function value of each sampling point based on the sampling point ray corresponding to each sampling point.
Optionally, the subset determination module 606 is further configured to:
determining a preset signed distance function value interval corresponding to each sampling point subset;
and determining the sampling point subset corresponding to each sampling point based on the signed distance function value corresponding to each sampling point and each preset signed distance function value interval.
Optionally, the face rendering module 608 is further configured to:
selecting a target sampling point subset;
and rendering to generate the target face sub-object corresponding to the target sampling point subset based on the density information and the color characteristic value of each sampling point in the target sampling point subset.
Optionally, the face rendering module 608 is further configured to:
generating an initial face based on each target face sub-object;
and improving the resolution of the initial face to obtain the target face.
Optionally, the face rendering module 608 is further configured to:
inputting the initial face into a resolution enhancement model;
and obtaining the target face corresponding to the initial face output by the resolution enhancement model.
Optionally, the apparatus further comprises a model training module configured to:
acquiring a sample sampling position, sample sampling view angle information and a sample face image;
inputting the sample sampling position, the sample sampling view angle information and the sample face image into a neural radiance field model, and obtaining the density information, the color characteristic value and the signed distance function value corresponding to each sample sampling point output by the neural radiance field model;
inputting the sample sampling position, the sample sampling view angle information and the sample face image into a face generation model, and obtaining a sample face output by the face generation model;
determining at least two sample sampling point subsets according to the density information, the color characteristic value and the signed distance function value corresponding to each sample sampling point, rendering to generate a predicted face sub-object corresponding to each sample sampling point subset, and generating a predicted face based on each predicted face sub-object;
carrying out image segmentation on the sample face to obtain the sample face sub-objects corresponding to the sample face (see the masking sketch after this list);
calculating a first loss value according to the predicted face sub-objects and the sample face sub-objects, and calculating a second loss value according to the predicted face and the sample face;
and adjusting model parameters of the neural radiance field model based on the first loss value and the second loss value, and continuing training until a model training stop condition is reached.
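The masking sketch referenced above (Python; the parsing label map and region ids are invented by the editor, since the patent does not name a specific segmentation method here) splits a face image into per-region sample face sub-objects:

```python
import numpy as np

# Hypothetical face-parsing label map: 0 = background, 1 = skin, 2 = hair.
labels = np.zeros((64, 64), dtype=np.int64)
labels[8:56, 8:56] = 1      # skin region
labels[8:20, 8:56] = 2      # hair region
face = np.random.rand(64, 64, 3)

def face_sub_objects(image, label_map, region_ids=(1, 2)):
    """One masked image per region; pixels outside the region are zeroed."""
    return {r: image * (label_map == r)[..., None] for r in region_ids}

parts = face_sub_objects(face, labels)  # e.g. parts[2] keeps only the hair pixels
```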
Optionally, the model training module is further configured to:
acquiring a reference face;
performing image segmentation on the reference face to obtain a reference face sub-object corresponding to the reference face;
calculating a third loss value according to the reference face sub-object and the predicted face sub-object;
and adjusting the model parameters of the neural radiance field model according to the third loss value.
The face generation apparatus based on the neural radiance field provided by the present specification obtains, by fusing the signed distance function (SDF) value into the neural radiance field (NeRF) model, the density information, the color characteristic value and the SDF value of each sampling point from the sampling position information, the sampling view angle information and the face image; each component of the face is determined by the SDF-value-fused NeRF model, the components are rendered separately, and the rendered components are then assembled into the face image. In this way, the density information and the color characteristic values preserve the geometric properties the neural network learns for the 3D object, the SDF value determines the component to which each sampling point belongs so that each component of the face image can be generated, and the quality and consistency of the rendering effect are guaranteed.
In addition, the resolution of the target face can be improved through a StyleGAN-based resolution enhancement model.
The above is a schematic description of the face generation apparatus based on the neural radiance field according to this embodiment. It should be noted that the technical solution of the face generation apparatus based on the neural radiance field and the technical solution of the face generation method based on the neural radiance field belong to the same concept; for details of the apparatus not described here, refer to the description of the method.
Fig. 7 illustrates a block diagram of a computing device 700 provided in accordance with an embodiment of the present specification. The components of the computing device 700 include, but are not limited to, memory 710 and a processor 720. Processor 720 is coupled to memory 710 via bus 730, and database 750 is used to store data.
Computing device 700 also includes access device 740, which enables computing device 700 to communicate via one or more networks 760. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the internet. Access device 740 may include one or more of any type of network interface, wired or wireless, e.g., a Network Interface Card (NIC), an IEEE 802.11 Wireless Local Area Network (WLAN) wireless interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth interface, a Near Field Communication (NFC) interface, and so forth.
In one embodiment of the present description, the above-described components of computing device 700, as well as other components not shown in FIG. 7, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device architecture shown in FIG. 7 is for purposes of example only and is not limiting as to the scope of the present description. Those skilled in the art may add or replace other components as desired.
Computing device 700 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smartphone), wearable computing device (e.g., smartwatch, smartglasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 700 may also be a mobile or stationary server.
The processor 720, when executing computer instructions, implements the steps of the image processing method based on the neural radiance field or the face generation method based on the neural radiance field.
The above is a schematic description of the computing device of this embodiment. It should be noted that the technical solution of the computing device and the technical solution of the image processing method or face generation method based on the neural radiance field belong to the same concept; for details of the computing device not described here, refer to the description of the method.
An embodiment of this specification further provides an augmented reality AR device or a virtual reality VR device, including:
a memory, a processor, and a display;
the memory is configured to store computer-executable instructions and the processor is configured to execute the computer-executable instructions, which when executed by the processor implement the steps of:
determining sampling position information, sampling view angle information and a target object image in response to an image processing instruction;
inputting the sampling position information, the sampling view angle information and the target object image into a pre-trained neural radiance field model for processing, and obtaining density information, a color characteristic value and a signed distance function value corresponding to each sampling point output by the neural radiance field model, wherein the neural radiance field model is a machine learning model;
determining at least two sampling point subsets based on the signed distance function value corresponding to each sampling point;
rendering, according to the density information and the color characteristic value of the sampling points in each sampling point subset, a target sub-object corresponding to each sampling point subset, and generating the target object based on each target sub-object;
and displaying the target object through a display of the augmented reality AR device or the virtual reality VR device.
The above is a schematic description of the augmented reality AR device or virtual reality VR device of this embodiment. It should be noted that the technical solution of the AR or VR device and the technical solution of the image processing method or face generation method based on the neural radiance field belong to the same concept; for details of the device not described here, refer to the description of the method.
An embodiment of the present specification further provides a computer-readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the image processing method based on the neural radiance field or the face generation method based on the neural radiance field as described above.
The above is a schematic description of the computer-readable storage medium of this embodiment. It should be noted that the technical solution of the storage medium and the technical solution of the image processing method or face generation method based on the neural radiance field belong to the same concept; for details of the storage medium not described here, refer to the description of the method.
An embodiment of the present specification further provides a computer program which, when executed in a computer, causes the computer to execute the steps of the image processing method based on the neural radiance field or the face generation method based on the neural radiance field.
The above is a schematic description of the computer program of this embodiment. It should be noted that the technical solution of the computer program and the technical solution of the image processing method or face generation method based on the neural radiance field belong to the same concept; for details of the computer program not described here, refer to the description of the method.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The computer instructions comprise computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be appropriately expanded or restricted according to the requirements of legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, computer-readable media do not include electrical carrier signals and telecommunication signals.
It should be noted that, for the sake of simplicity, the foregoing method embodiments are described as a series of acts, but those skilled in the art should understand that the present embodiment is not limited by the described acts, because some steps may be performed in other sequences or simultaneously according to the present embodiment. Further, those skilled in the art should also appreciate that the embodiments described in this specification are preferred embodiments and that acts and modules referred to are not necessarily required for an embodiment of the specification.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The preferred embodiments of the present specification disclosed above are intended only to aid in the description of the specification. Alternative embodiments are not exhaustive and do not limit the invention to the precise embodiments described. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the embodiments and the practical application, to thereby enable others skilled in the art to best understand and utilize the embodiments. The specification is limited only by the claims and their full scope and equivalents.

Claims (14)

1. An image processing method based on a neural radiance field, comprising:
determining sampling position information, sampling view angle information and a target object image in response to an image processing instruction;
inputting the sampling position information, the sampling view angle information and the target object image into a pre-trained neural radiance field model for processing, and obtaining density information, a color characteristic value and a signed distance function value corresponding to each sampling point output by the neural radiance field model, wherein the neural radiance field model is a machine learning model;
determining at least two sampling point subsets based on the signed distance function value corresponding to each sampling point;
and rendering, according to the density information and the color characteristic value of the sampling points in each sampling point subset, a target sub-object corresponding to each sampling point subset, and generating the target object based on each target sub-object.
2. The method of claim 1, wherein determining the sampling position information and the sampling view angle information comprises:
determining sampling position information on a preset hemispherical surface;
and determining corresponding sampling view angle information based on the sampling position information.
3. The method of claim 1, wherein inputting the sampling position information, the sampling view angle information and the target object image into the pre-trained neural radiance field model for processing comprises:
inputting the sampling position information, the sampling view angle information and the target object image into the pre-trained neural radiance field model;
emitting, by the neural radiance field model, sampling point rays toward sampling points of the target object image based on the sampling position information and the sampling view angle information;
and determining the density information, the color characteristic value and the signed distance function value of each sampling point based on the sampling point ray corresponding to each sampling point.
4. The method of claim 1, wherein determining at least two sampling point subsets based on the signed distance function value corresponding to each sampling point comprises:
determining a preset signed distance function value interval corresponding to each sampling point subset;
and determining the sampling point subset corresponding to each sampling point based on the signed distance function value corresponding to each sampling point and each preset signed distance function value interval.
5. The method of claim 1, wherein rendering to generate the target sub-object corresponding to each sampling point subset according to the density information and the color characteristic value of the sampling points in each sampling point subset comprises:
selecting a target sampling point subset;
and rendering to generate the target sub-object corresponding to the target sampling point subset based on the density information and the color characteristic value of each sampling point in the target sampling point subset.
6. The method of claim 1, wherein generating the target object based on each target sub-object comprises:
generating an initial target object based on each target sub-object;
and improving the resolution of the initial target object to obtain the target object.
7. The method of claim 6, wherein improving the resolution of the initial target object to obtain the target object comprises:
inputting the initial target object into a resolution enhancement model;
and obtaining the target object corresponding to the initial target object output by the resolution enhancement model.
8. The method of claim 1, wherein the neural radiance field model is trained by:
acquiring a sample sampling position, sample sampling view angle information and a sample object image;
inputting the sample sampling position, the sample sampling view angle information and the sample object image into a neural radiance field model, and obtaining density information, a color characteristic value and a signed distance function value corresponding to each sample sampling point output by the neural radiance field model;
inputting the sample sampling position, the sample sampling view angle information and the sample object image into an object generation model, and obtaining a sample object output by the object generation model;
determining at least two sample sampling point subsets according to the density information, the color characteristic value and the signed distance function value corresponding to each sample sampling point, rendering to generate a prediction sub-object corresponding to each sample sampling point subset, and generating a prediction object based on each prediction sub-object;
carrying out image segmentation on the sample object to obtain the sample sub-objects corresponding to the sample object;
calculating a first loss value according to the prediction sub-objects and the sample sub-objects, and calculating a second loss value according to the prediction object and the sample object;
and adjusting model parameters of the neural radiance field model based on the first loss value and the second loss value, and continuing training until a model training stop condition is reached.
9. The method of claim 8, wherein after rendering to generate the prediction sub-object corresponding to each sample sampling point subset, the method further comprises:
acquiring a reference object;
performing image segmentation on the reference object to obtain a reference sub-object corresponding to the reference object;
calculating a third loss value according to the reference sub-object and the prediction sub-object;
and adjusting the model parameters of the neural radiance field model according to the third loss value.
10. A face generation method based on a neural radiance field, comprising:
determining sampling position information, sampling view angle information and a face image in response to a face generation instruction;
inputting the sampling position information, the sampling view angle information and the face image into a pre-trained neural radiance field model for processing, and obtaining density information, a color characteristic value and a signed distance function value corresponding to each sampling point output by the neural radiance field model, wherein the neural radiance field model is a machine learning model;
determining at least two sampling point subsets based on the signed distance function value corresponding to each sampling point;
and rendering, based on each sampling point subset, a face sub-object corresponding to each sampling point subset, and generating a target face based on each face sub-object.
11. An image processing apparatus based on a neural radiance field, comprising:
a determination module configured to determine sampling position information, sampling view angle information and a target object image in response to an image processing instruction;
an input module configured to input the sampling position information, the sampling view angle information and the target object image into a pre-trained neural radiance field model for processing, and obtain density information, a color characteristic value and a signed distance function value corresponding to each sampling point output by the neural radiance field model, wherein the neural radiance field model is a machine learning model;
a subset determination module configured to determine at least two sampling point subsets based on the signed distance function value corresponding to each sampling point;
and a rendering module configured to render, according to the density information and the color characteristic value of the sampling points in each sampling point subset, a target sub-object corresponding to each sampling point subset, and generate the target object based on each target sub-object.
12. An Augmented Reality (AR) device or a Virtual Reality (VR) device comprising:
a memory, a processor, and a display;
the memory is configured to store computer-executable instructions and the processor is configured to execute the computer-executable instructions, which when executed by the processor implement the steps of:
determining sampling position information, sampling view angle information and a target object image in response to an image processing instruction;
inputting the sampling position information, the sampling view angle information and the target object image into a pre-trained neural radiance field model for processing, and obtaining density information, a color characteristic value and a signed distance function value corresponding to each sampling point output by the neural radiance field model, wherein the neural radiance field model is a machine learning model;
determining at least two sampling point subsets based on the signed distance function value corresponding to each sampling point;
rendering, according to the density information and the color characteristic value of the sampling points in each sampling point subset, a target sub-object corresponding to each sampling point subset, and generating the target object based on each target sub-object;
and displaying the target object through a display of the augmented reality AR device or the virtual reality VR device.
13. A computing device comprising a memory, a processor, and computer instructions stored on the memory and executable on the processor, the processor implementing the steps of the method of any one of claims 1-9 or 10 when executing the computer instructions.
14. A computer-readable storage medium storing computer-executable instructions that, when executed by a processor, perform the steps of the method of any one of claims 1-9 or 10.
CN202210420957.6A 2022-04-21 2022-04-21 Image processing method and device based on nerve radiation field Pending CN114972632A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210420957.6A CN114972632A (en) 2022-04-21 2022-04-21 Image processing method and device based on nerve radiation field

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210420957.6A CN114972632A (en) 2022-04-21 2022-04-21 Image processing method and device based on nerve radiation field

Publications (1)

Publication Number Publication Date
CN114972632A (en) 2022-08-30

Family

ID=82977678

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210420957.6A Pending CN114972632A (en) 2022-04-21 2022-04-21 Image processing method and device based on nerve radiation field

Country Status (1)

Country Link
CN (1) CN114972632A (en)

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115147577A (en) * 2022-09-06 2022-10-04 深圳市明源云科技有限公司 VR scene generation method, device, equipment and storage medium
CN116246009B (en) * 2022-09-06 2024-04-16 支付宝(杭州)信息技术有限公司 Virtual image processing method and device
CN116246009A (en) * 2022-09-06 2023-06-09 支付宝(杭州)信息技术有限公司 Virtual image processing method and device
CN115761565A (en) * 2022-10-09 2023-03-07 名之梦(上海)科技有限公司 Video generation method, device, equipment and computer readable storage medium
CN115631418A (en) * 2022-11-18 2023-01-20 北京百度网讯科技有限公司 Image processing method, training method of nerve radiation field and neural network
CN115631418B (en) * 2022-11-18 2023-05-16 北京百度网讯科技有限公司 Image processing method and device and training method of nerve radiation field
CN116168137A (en) * 2023-04-21 2023-05-26 湖南马栏山视频先进技术研究院有限公司 New view angle synthesis method, device and memory based on nerve radiation field
CN116188698B (en) * 2023-04-23 2023-09-12 阿里巴巴达摩院(杭州)科技有限公司 Object processing method and electronic equipment
CN116188698A (en) * 2023-04-23 2023-05-30 阿里巴巴达摩院(杭州)科技有限公司 Object processing method and electronic equipment
CN116612204A (en) * 2023-06-01 2023-08-18 北京百度网讯科技有限公司 Image generation method, training device, electronic equipment and storage medium
CN116612204B (en) * 2023-06-01 2024-05-03 北京百度网讯科技有限公司 Image generation method, training device, electronic equipment and storage medium
CN117036157A (en) * 2023-10-09 2023-11-10 易方信息科技股份有限公司 Editable simulation digital human figure design method, system, equipment and medium
CN117036157B (en) * 2023-10-09 2024-02-20 易方信息科技股份有限公司 Editable simulation digital human figure design method, system, equipment and medium
CN117456097A (en) * 2023-10-30 2024-01-26 南通海赛未来数字科技有限公司 Three-dimensional model construction method and device
CN117333637A (en) * 2023-12-01 2024-01-02 北京渲光科技有限公司 Modeling and rendering method, device and equipment for three-dimensional scene
CN117333637B (en) * 2023-12-01 2024-03-08 北京渲光科技有限公司 Modeling and rendering method, device and equipment for three-dimensional scene
CN117456078A (en) * 2023-12-19 2024-01-26 北京渲光科技有限公司 Neural radiation field rendering method, system and equipment based on various sampling strategies
CN117456078B (en) * 2023-12-19 2024-03-26 北京渲光科技有限公司 Neural radiation field rendering method, system and equipment based on various sampling strategies

Similar Documents

Publication Publication Date Title
CN114972632A (en) Image processing method and device based on nerve radiation field
WO2020192568A1 (en) Facial image generation method and apparatus, device and storage medium
US10540817B2 (en) System and method for creating a full head 3D morphable model
CN109859098B (en) Face image fusion method and device, computer equipment and readable storage medium
US10789453B2 (en) Face reenactment
US11049310B2 (en) Photorealistic real-time portrait animation
CN108573527B (en) Expression picture generation method and equipment and storage medium thereof
CN108537881B (en) Face model processing method and device and storage medium thereof
EP3992919B1 (en) Three-dimensional facial model generation method and apparatus, device, and medium
KR102602112B1 (en) Data processing method, device, and medium for generating facial images
CN113628327B (en) Head three-dimensional reconstruction method and device
WO2022205760A1 (en) Three-dimensional human body reconstruction method and apparatus, and device and storage medium
US20220222897A1 (en) Portrait editing and synthesis
US11455765B2 (en) Method and apparatus for generating virtual avatar
EP4200745A1 (en) Cross-domain neural networks for synthesizing image with fake hair combined with real image
CN110458924B (en) Three-dimensional face model establishing method and device and electronic equipment
WO2023066120A1 (en) Image processing method and apparatus, electronic device, and storage medium
CN114445562A (en) Three-dimensional reconstruction method and device, electronic device and storage medium
CN111028318A (en) Virtual face synthesis method, system, device and storage medium
CN113822965A (en) Image rendering processing method, device and equipment and computer storage medium
CN115731344A (en) Image processing model training method and three-dimensional object model construction method
CN116051722A (en) Three-dimensional head model reconstruction method, device and terminal
CN114782240A (en) Picture processing method and device
CN115714888B (en) Video generation method, device, equipment and computer readable storage medium
Ma et al. Any-to-one face reenactment based on conditional generative adversarial network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination