CN115272565A - Head three-dimensional model reconstruction method and electronic equipment - Google Patents

Head three-dimensional model reconstruction method and electronic equipment

Info

Publication number
CN115272565A
CN115272565A (application number CN202210842557.4A)
Authority
CN
China
Prior art keywords
value
loss value
head
target
dimensional model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210842557.4A
Other languages
Chinese (zh)
Inventor
赵笑晨
刘烨斌
刘帅
梁大才
王宝云
于芝涛
吴连朋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Hisense Visual Technology Co Ltd
Juhaokan Technology Co Ltd
Original Assignee
Tsinghua University
Hisense Visual Technology Co Ltd
Juhaokan Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University, Hisense Visual Technology Co Ltd, Juhaokan Technology Co Ltd
Priority to CN202210842557.4A
Publication of CN115272565A
Legal status: Pending (Current)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00 Three-dimensional [3D] modelling, e.g. data description of 3D objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 15/00 3D [Three Dimensional] image rendering
    • G06T 15/005 General purpose rendering architectures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/50 Depth or shape recovery
    • G06T 7/55 Depth or shape recovery from multiple images
    • G06T 7/593 Depth or shape recovery from multiple images from stereo images
    • G06T 7/596 Depth or shape recovery from multiple images from stereo images from three or more stereo images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10024 Color image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30196 Human being; Person

Abstract

The application provides a head three-dimensional model reconstruction method and an electronic device, which are used for improving the quality of the head three-dimensional model. The method comprises the following steps: for any target object, inputting multi-view RGBD images of the target object into a pre-trained head three-dimensional model reconstruction neural network to obtain a head three-dimensional model. The head three-dimensional model reconstruction neural network is trained as follows: a training sample is input into the head three-dimensional model reconstruction neural network to obtain a predicted signed distance field (SDF) value for each rendered image, each predicted rendered image, and a head three-dimensional model, where the training sample comprises rendered images of a head three-dimensional model rendered under different viewing angles and illumination conditions and a target SDF value for each rendered image; a target loss value is determined by using a first intermediate loss value obtained based on each predicted rendered image and each rendered image and a second intermediate loss value obtained based on the predicted SDF values and the target SDF values; and if the target loss value is greater than a specified threshold, the training is ended.

Description

Head three-dimensional model reconstruction method and electronic equipment
Technical Field
The present disclosure relates to the field of three-dimensional reconstruction technologies, and in particular, to a method for reconstructing a head three-dimensional model and an electronic device.
Background
With the continuous development of three-dimensional reconstruction technology in the field of computer vision, three-dimensional reconstruction of the human body has become a research hotspot. How to reconstruct a higher-quality head three-dimensional model when only low-resolution images are available as input has become an important research direction with practical value and research significance.
In the prior art, methods for reconstructing a head three-dimensional model based on sparse multi-view color depth images mainly include reconstruction of a parameterized model and reconstruction of a model based on a voxel representation. Reconstruction of a parameterized model estimates two-dimensional joint positions from the picture, obtains the model parameters by minimizing the projection distance between the three-dimensional joints and the two-dimensional joints, and thereby obtains the three-dimensional model. Reconstruction of a model based on a voxel representation divides the space into many small cubes and represents the three-dimensional model by whether each cube is occupied by the object.
However, when only low-resolution images are input, both the parameterized model reconstruction and the voxel-based reconstruction yield head three-dimensional models of low quality.
Disclosure of Invention
The application provides a reconstruction method of a head three-dimensional model and electronic equipment, which are used for improving the quality of the reconstructed head three-dimensional model.
In a first aspect, an embodiment of the present application provides a method for reconstructing a head three-dimensional model, where the method includes:
for any target object, inputting an acquired multi-view color depth (RGBD) image of the target object into a pre-trained head three-dimensional model reconstruction neural network to obtain a head three-dimensional model of the target object;
wherein the head three-dimensional model reconstruction neural network is trained in the following manner:
obtaining a training sample, wherein the training sample comprises rendered images of a head three-dimensional model rendered under different viewing angles and different illumination conditions and target signed distance field (SDF) values respectively corresponding to the rendered images;
inputting the training sample into the head three-dimensional model reconstruction neural network to obtain predicted SDF values corresponding to the rendered images, predicted rendered images, and a head three-dimensional model;
obtaining a first intermediate loss value based on each predicted rendered image and each rendered image, and obtaining a second intermediate loss value based on the predicted SDF values and the target SDF values;
obtaining a target loss value by using the first intermediate loss value and the second intermediate loss value;
and if the target loss value is greater than a specified threshold, ending the training of the head three-dimensional model reconstruction neural network.
In a second aspect, the present application provides an electronic device comprising a processor and a memory, the processor and the memory being connected by a bus;
the memory stores a computer program, and the processor is configured to perform the following operations based on the computer program:
for any target object, inputting an acquired multi-view color depth (RGBD) image of the target object into a pre-trained head three-dimensional model reconstruction neural network to obtain a head three-dimensional model of the target object;
wherein the head three-dimensional model reconstruction neural network is trained in the following manner:
obtaining a training sample, wherein the training sample comprises rendered images of a head three-dimensional model rendered under different viewing angles and different illumination conditions and target signed distance field (SDF) values respectively corresponding to the rendered images;
inputting the training sample into the head three-dimensional model reconstruction neural network to obtain predicted SDF values corresponding to the rendered images, predicted rendered images, and a head three-dimensional model;
obtaining a first intermediate loss value based on each predicted rendered image and each rendered image, and obtaining a second intermediate loss value based on the predicted SDF values and the target SDF values;
obtaining a target loss value by using the first intermediate loss value and the second intermediate loss value;
and if the target loss value is greater than a specified threshold, ending the training of the head three-dimensional model reconstruction neural network.
According to a third aspect provided by an embodiment of the present application, there is provided a computer storage medium storing a computer program for executing the method according to the first aspect.
In the embodiments of the application, the rendered images in the training samples are obtained by applying an image rendering method to a fine head three-dimensional model, so they carry realistic color and depth data. Training the head three-dimensional model reconstruction neural network with these rendered images and their target signed distance field values makes the trained network more accurate, so that a high-quality head three-dimensional model can be obtained even when a low-resolution image is input into the trained network, thereby improving the quality of the head three-dimensional model.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present application, and that those skilled in the art can derive other drawings from them without inventive effort.
Fig. 1 schematically illustrates an application scenario provided by an embodiment of the present application;
Fig. 2 is a flow chart illustrating a training method for a head three-dimensional model reconstruction neural network according to an embodiment of the present application;
Fig. 3 schematically illustrates the structure of an encoder-decoder network provided by an embodiment of the present application;
Fig. 4 is a schematic flow chart for determining a first intermediate loss value according to an embodiment of the present application;
Fig. 5 exemplarily illustrates the target pixel points corresponding to a rendered image provided by an embodiment of the present application;
Fig. 6 is a schematic flow chart illustrating a method for reconstructing a head three-dimensional model according to an embodiment of the present application;
Fig. 7 is a schematic diagram exemplarily showing the shooting of multi-view images provided by an embodiment of the present application;
Fig. 8 is a schematic flow chart illustrating a method for reconstructing a head three-dimensional model provided by an embodiment of the present application;
Fig. 9 is a schematic structural diagram illustrating an apparatus for reconstructing a head three-dimensional model provided by an embodiment of the present application;
Fig. 10 illustrates a hardware structure diagram of an electronic device according to an embodiment of the present application.
Detailed Description
To make the objects, embodiments and advantages of the present application clearer, the following is a clear and complete description of exemplary embodiments of the present application with reference to the attached drawings in exemplary embodiments of the present application, and it is apparent that the exemplary embodiments described are only a part of the embodiments of the present application, and not all of the embodiments.
All other embodiments obtained by a person skilled in the art from the exemplary embodiments described herein without inventive effort are intended to fall within the scope of the appended claims. In addition, although the disclosure is presented in terms of one or more exemplary examples, it should be appreciated that each aspect of the disclosure may independently constitute a complete embodiment.
It should be noted that the brief descriptions of the terms in the present application are only for the convenience of understanding the embodiments described below, and are not intended to limit the embodiments of the present application. These terms should be understood in their ordinary and customary meaning unless otherwise indicated.
The terms "first," "second," and the like in the description and claims of this application and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises" and "comprising," as well as any variations thereof, are intended to cover a non-exclusive inclusion, such that a product or device that comprises a list of elements is not necessarily limited to those elements explicitly listed, but may include other elements not expressly listed or inherent to such product or device.
The term "module" as used herein refers to any known or later developed hardware, software, firmware, artificial intelligence, fuzzy logic, or combination of hardware and/or software code that is capable of performing the functionality associated with that element.
The idea of an embodiment of the present application is summarized below.
In the related art, reconstruction of a head three-dimensional model yields only a low-quality head three-dimensional model when a low-resolution image is input.
To address this problem in the prior art, an embodiment of the application provides a method for reconstructing a head three-dimensional model. An image rendering method is applied to a fine head three-dimensional model so that the obtained rendered images carry realistic color and depth data, and the rendered images together with their target signed distance field values are then used to train the head three-dimensional model reconstruction neural network. As a result, the trained network is more accurate, a high-quality head three-dimensional model can be obtained even when a low-resolution image is input into it, and the quality of the head three-dimensional model is improved. Embodiments of the present application are described in detail below with reference to the accompanying drawings.
Fig. 1 is a schematic diagram illustrating an application scenario of head three-dimensional model reconstruction provided by an embodiment of the present application. As shown in fig. 1, the application scenario is described taking an electronic device as an example, and includes a terminal device 110, cameras 120, and a server 130. The server 130 may be implemented by a single server or by a plurality of servers, and may be a physical server or a virtual server.
In one possible application scenario, for any target object, the server 130 acquires multi-view RGBD images of the target object captured by the cameras 120 and inputs them into a pre-trained head three-dimensional model reconstruction neural network to obtain the head three-dimensional model corresponding to the target object. The head three-dimensional model reconstruction neural network is trained as follows: the server 130 acquires a training sample, where the training sample comprises rendered images of a head three-dimensional model rendered under different viewing angles and different illumination conditions and target signed distance field (SDF) values corresponding to the rendered images; inputs the training sample into the head three-dimensional model reconstruction neural network to obtain predicted SDF values corresponding to the rendered images, predicted rendered images, and a head three-dimensional model; and obtains a first intermediate loss value based on each predicted rendered image and each rendered image and a second intermediate loss value based on the predicted SDF values and the target SDF values. The server 130 then obtains a target loss value by using the first intermediate loss value and the second intermediate loss value; if the target loss value is greater than a specified threshold, the training of the head three-dimensional model reconstruction neural network is ended.
The server 130 and the terminal device 110 in fig. 1 may perform information interaction through a communication network, where a communication mode adopted by the communication network may be classified as a wireless communication mode or a wired communication mode.
Illustratively, the server 130 may communicate with the terminal device 110 by accessing the network through a cellular mobile communication technology, for example, the fifth-generation mobile network (5G) technology.
Optionally, the server 130 may access the network through a short-range wireless communication technology, for example, Wireless Fidelity (Wi-Fi), to communicate with the terminal device 110.
Note that only a single terminal device 110, four cameras 120, and a single server 130 are described in detail in the present application; those skilled in the art will understand that they are shown to illustrate how the technical solution operates and do not limit the number, type, or location of terminal devices, cameras, and servers. The underlying concepts of the example embodiments of the present application would not change if modules were added to or removed from the illustrated environment.
Illustratively, the terminal device 110 includes, but is not limited to: a visual large screen, a tablet computer, a notebook computer, a palmtop computer, a mobile Internet device (MID), a wearable device, a virtual reality (VR) device, an augmented reality (AR) device, or a wireless terminal device in industrial control, unmanned driving, smart grid, transportation safety, smart city, or smart home. The terminal device may have a client installed on it, where the client may be software (e.g., a browser or short-video software) or a web page, applet, or the like.
It should be noted that the method for reconstructing a three-dimensional head model provided by the present application is not only applicable to the application scenario shown in fig. 1, but also applicable to any device for reconstructing a three-dimensional head model.
The method for reconstructing a three-dimensional head model according to the exemplary embodiment of the present application is described below with reference to the drawings in conjunction with the application scenarios described above, and it should be noted that the application scenarios are only shown for the convenience of understanding the method and the principle of the present application, and the embodiments of the present application are not limited in this respect.
First, the training method of the head three-dimensional model reconstruction neural network in the present application is described in detail. As shown in fig. 2, the training flow of the head three-dimensional model reconstruction neural network includes the following steps:
step 201: obtaining a training sample, wherein the training sample comprises rendering images of a head three-dimensional model rendered under different viewing angles and different illumination conditions and target symbol distance field sdf values respectively corresponding to the rendering images;
it should be noted that: the three-dimensional model of the head in the training sample is obtained based on multi-view images of the same object taken at different angles.
In one embodiment, the target sdf value corresponding to each rendered image is determined by:
the method comprises the steps of projecting a rendering image into a three-dimensional space aiming at any one rendering image to obtain an outer surface point cloud of a head three-dimensional model corresponding to the rendering image, randomly sampling a plurality of three-dimensional points in the three-dimensional space, and obtaining a target sdf value corresponding to the rendering image based on distances between the three-dimensional points and the outer surface point cloud.
Step 202: inputting the training sample into the head three-dimensional model reconstruction neural network to obtain predicted SDF values corresponding to the rendered images, predicted rendered images, and a head three-dimensional model;
the encoder-decoder network used by the neural network for reconstructing the head three-dimensional model in this embodiment includes a dual implicit field network, i.e., a symbol distance field network and a neural radiation field network. For example, as shown in fig. 3, which is a schematic structural diagram of a coder-decoder network, as can be seen from fig. 3, the coder-decoder network includes a neural radiation field network and a symbol distance field network, obtains a predicted sdf value of a rendered image by using the symbol distance field network, obtains a predicted rendered image through the neural radiation field, obtains a target loss value based on the predicted sdf value and the target sdf value, and the predicted rendered image and the rendered image, and adjusts a parameter of the head three-dimensional model reconstruction neural network by using the target loss value as an L1 norm, so as to obtain a trained head three-dimensional model reconstruction neural network.
Step 203: obtaining a first intermediate loss value based on each predicted rendered image and each rendered image, and obtaining a second intermediate loss value based on the predicted SDF values and the target SDF values;
the specific manner of determining the first intermediate loss value and the specific manner of determining the second intermediate loss value are described below, respectively:
1. first intermediate loss value:
as shown in fig. 4, a schematic flow chart for determining the first intermediate loss value includes the following steps:
step 401: aiming at any pixel point in the rendered image, obtaining a sub-loss value corresponding to the pixel point based on the pixel value of the pixel point and the pixel value of a target pixel point corresponding to the pixel point in the predicted rendered image, wherein the target pixel point is the pixel point with the same position coordinate as the pixel point in the rendered image in the predicted rendered image;
for example, as shown in fig. 5, the left image is a rendered image, and the right image is a predicted rendered image. The pixel point a 'in the predicted rendered image is a target pixel point of the pixel point a in the rendered image, the pixel point B' in the predicted rendered image is a target pixel point of the pixel point B in the rendered image, and so on, and the description is omitted here.
In one embodiment, step 401 may be implemented as: determining the absolute value of the difference between the pixel value of the pixel point and the pixel value of the target pixel point as the sub-loss value corresponding to the pixel point. The sub-loss value of the pixel point can be determined by formula (1):
S′ = |P_A - P_B| ……(1);
wherein S′ is the sub-loss value of the pixel point, P_A is the pixel value of the pixel point, and P_B is the pixel value of the target pixel point.
Step 402: obtaining the first intermediate loss value by using the sub-loss values corresponding to the pixel points in the rendered image.
In one embodiment, step 402 may be implemented as: adding the sub-loss values corresponding to the pixel points in the rendered image to obtain the first intermediate loss value.
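In code, steps 401 and 402 amount to the pixel-wise L1 distance between the rendered image and the predicted rendered image. A minimal sketch, assuming both images are tensors of identical shape:

```python
def first_intermediate_loss(rendered, predicted):
    # Sub-loss per pixel point: |P_A - P_B| as in formula (1),
    # then summed over all pixel points (step 402).
    return (rendered - predicted).abs().sum()
```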
2. Second intermediate loss value:
The absolute value of the difference between the predicted SDF value and the target SDF value is determined as the second intermediate loss value.
Step 204: obtaining a target loss value by using the first intermediate loss value and the second intermediate loss value;
in one embodiment, the target loss value is determined by: and adding the first intermediate loss value and the second intermediate loss value to obtain the target loss value.
Step 205: judging whether the target loss value is greater than a specified threshold; if so, executing step 207, and if not, executing step 206;
it should be noted that: the designated threshold in this embodiment may be set according to actual situations, and the specific value of the designated threshold is not limited in this embodiment.
Step 206: after adjusting the specified parameters of the head three-dimensional model reconstruction neural network, returning to step 202;
the designated parameters of the head three-dimensional model reconstruction neural network can be adjusted according to a preset adjustment rule, wherein the preset adjustment rule can increase or decrease the designated parameters by designated values each time. The specific preset adjustment rule may be set according to actual conditions, and this embodiment is not limited herein.
Step 207: ending the training of the head three-dimensional model reconstruction neural network.
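The loop of steps 202 to 207 can then be sketched as below. The patent's stopping convention is kept as written (training ends once the target loss value exceeds the specified threshold), gradient descent stands in for the unspecified preset adjustment rule, and the network interface and hyperparameters are assumptions; target_loss is the sketch from above.

```python
import torch

def train(network, images, target_sdf, threshold, lr=1e-4, max_iters=10000):
    optimizer = torch.optim.Adam(network.parameters(), lr=lr)
    for _ in range(max_iters):
        predicted_sdf, predicted_images = network(images)   # step 202
        loss = target_loss(images, predicted_images,
                           target_sdf, predicted_sdf)       # steps 203-204
        if loss.item() > threshold:                         # step 205, as written
            break                                           # step 207: end training
        optimizer.zero_grad()                               # step 206: adjust parameters
        loss.backward()
        optimizer.step()
    return network
```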
Having introduced the training of the head three-dimensional model reconstruction neural network, the method for reconstructing the head three-dimensional model in the present application is introduced below. Fig. 6 shows a schematic flow chart of the method, which includes the following steps:
step 601: aiming at any one target object, acquiring a multi-view color depth RGBD image of the target object;
step 602: and inputting the multi-view RGBD image into a pre-trained head three-dimensional model reconstruction neural network to obtain a three-dimensional model of the human head corresponding to the target object.
It should be noted that in this embodiment, the multi-view RGBD images of the target object are obtained by shooting the target object with a plurality of depth cameras at different shooting angles; that is, with the target object at the center, the depth cameras are distributed in a circle around it. For example, as shown in fig. 7, four depth cameras capture multi-view images around the target object. The number of multi-view RGBD images is the same as the number of depth cameras. In this embodiment the multi-view RGBD images are obtained by depth cameras at four viewing angles; however, the number of depth cameras may be set according to the actual situation, and this embodiment does not limit it.
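The patent does not specify how the head three-dimensional model is extracted from the learned signed distance field in step 602; one common choice, shown as an assumption below, is to sample the SDF on a regular grid and run marching cubes on its zero level set (scikit-image provides measure.marching_cubes). The grid resolution and bounds are illustrative.

```python
import numpy as np
import torch
from skimage import measure

def extract_head_mesh(sdf_network, resolution=128, bound=1.0):
    """Evaluate the trained SDF network on a regular grid and extract
    the zero level set as a triangle mesh."""
    axis = np.linspace(-bound, bound, resolution, dtype=np.float32)
    grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)
    points = torch.from_numpy(grid.reshape(-1, 3))
    with torch.no_grad():
        sdf = sdf_network(points).reshape(resolution, resolution, resolution)
    verts, faces, _, _ = measure.marching_cubes(sdf.numpy(), level=0.0)
    # Map vertices from grid-index coordinates back to world coordinates.
    verts = verts / (resolution - 1) * 2 * bound - bound
    return verts, faces
```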
To further explain the technical solution of the present application, a detailed description is given below with reference to fig. 8; the flow may include the following steps:
Step 801: obtaining a training sample, wherein the training sample comprises rendered images of a head three-dimensional model rendered under different viewing angles and different illumination conditions and target signed distance field (SDF) values corresponding to the rendered images;
Step 802: inputting the training sample into the head three-dimensional model reconstruction neural network to obtain predicted SDF values corresponding to the rendered images, predicted rendered images, and a head three-dimensional model;
Step 803: obtaining a first intermediate loss value based on each predicted rendered image and each rendered image, and obtaining a second intermediate loss value based on the predicted SDF values and the target SDF values;
Step 804: obtaining a target loss value by using the first intermediate loss value and the second intermediate loss value;
Step 805: judging whether the target loss value is greater than the specified threshold; if not, executing step 806, and if so, executing step 807;
Step 806: after adjusting the specified parameters of the head three-dimensional model reconstruction neural network, returning to step 802;
Step 807: ending the training of the head three-dimensional model reconstruction neural network;
Step 808: for any target object, acquiring multi-view color depth (RGBD) images of the target object;
Step 809: inputting the multi-view RGBD images into the pre-trained head three-dimensional model reconstruction neural network to obtain the head three-dimensional model corresponding to the target object.
Based on the same inventive concept, the method for reconstructing a head three-dimensional model described above can also be implemented by an apparatus for reconstructing a head three-dimensional model. The effect of the apparatus is similar to that of the method and is not repeated here.
Fig. 9 is a schematic structural diagram of a device for reconstructing a three-dimensional model of a head according to an embodiment of the present disclosure.
As shown in fig. 9, the apparatus 900 for reconstructing a three-dimensional head model of the present disclosure may include an obtaining module 910 and a head three-dimensional model determining module 920.
An obtaining module 910, configured to obtain, for any target object, a multi-view color depth RGBD image of the target object;
a head three-dimensional model determining module 920, configured to input the multi-view RGBD image into a pre-trained head three-dimensional model reconstruction neural network, so as to obtain a head three-dimensional model corresponding to the target object;
wherein the head three-dimensional model reconstruction neural network is trained in the following manner:
obtaining a training sample, wherein the training sample comprises rendered images of a head three-dimensional model rendered under different viewing angles and different illumination conditions and target signed distance field (SDF) values respectively corresponding to the rendered images;
inputting the training sample into the head three-dimensional model reconstruction neural network to obtain predicted SDF values corresponding to the rendered images, predicted rendered images, and a head three-dimensional model;
obtaining a first intermediate loss value based on each predicted rendered image and each rendered image, and obtaining a second intermediate loss value based on the predicted SDF values and the target SDF values;
obtaining a target loss value by using the first intermediate loss value and the second intermediate loss value;
and if the target loss value is greater than a specified threshold, ending the training of the head three-dimensional model reconstruction neural network.

In one embodiment, the apparatus further comprises:
a head three-dimensional model training module 930, configured to adjust a specified parameter of the head three-dimensional model reconstruction neural network if the target loss value is not greater than the specified threshold, and return to the step of inputting the training sample into the head three-dimensional model reconstruction neural network until the target loss value is greater than the specified threshold, and then end training of the head three-dimensional model reconstruction neural network.
In one embodiment, when obtaining the first intermediate loss value based on each predicted rendered image and each rendered image, the head three-dimensional model determining module 920 is specifically configured to:
for any pixel point in the rendered image, obtain a sub-loss value corresponding to the pixel point based on the pixel value of the pixel point and the pixel value of the target pixel point corresponding to the pixel point in the predicted rendered image, wherein the target pixel point is the pixel point in the predicted rendered image whose position coordinates are the same as those of the pixel point in the rendered image; and
obtain the first intermediate loss value by using the sub-loss values corresponding to the pixel points in the rendered image.
In one embodiment, when obtaining the sub-loss value corresponding to the pixel point based on the pixel value of the pixel point and the pixel value of the target pixel point corresponding to the pixel point in the predicted rendered image, the head three-dimensional model determining module 920 is specifically configured to:
determine the absolute value of the difference between the pixel value of the pixel point and the pixel value of the target pixel point as the sub-loss value corresponding to the pixel point;
and, when obtaining the first intermediate loss value by using the sub-loss values corresponding to the pixel points in the rendered image, is specifically configured to:
add the sub-loss values corresponding to the pixel points in the rendered image to obtain the first intermediate loss value.
In one embodiment, when obtaining the second intermediate loss value based on the predicted SDF value and the target SDF value, the head three-dimensional model determining module 920 is specifically configured to:
determine the absolute value of the difference between the predicted SDF value and the target SDF value as the second intermediate loss value;
and, when obtaining the target loss value by using the first intermediate loss value and the second intermediate loss value, is specifically configured to:
add the first intermediate loss value and the second intermediate loss value to obtain the target loss value.
After a method and an apparatus for reconstructing a three-dimensional model of a head according to an exemplary embodiment of the present invention are introduced, an electronic device according to another exemplary embodiment of the present invention is introduced.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method, or program product. Thus, various aspects of the invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, which may all generally be referred to herein as a "circuit," "module," or "system."
In some possible embodiments, an electronic device in accordance with the present invention may include at least one processor, and at least one computer storage medium. The computer storage medium stores therein program code, which, when executed by a processor, causes the processor to perform the steps of the method for reconstructing a three-dimensional model of a head according to various exemplary embodiments of the present invention described above in this specification. For example, the processor may perform steps 601-602 as shown in FIG. 6.
An electronic device 1000 according to this embodiment of the invention is described below with reference to fig. 10. The electronic device 1000 shown in fig. 10 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 10, the electronic apparatus 1000 is represented in the form of a general electronic apparatus. The components of the electronic device 1000 may include, but are not limited to: the at least one processor 1001, the at least one computer storage medium 1002, and the bus 1003 connecting the various system components (including the computer storage medium 1002 and the processor 1001).
Bus 1003 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, a processor, or a local bus using any of a variety of bus architectures.
The computer storage medium 1002 may include readable media in the form of volatile memory, such as random access memory (RAM) 1021 and/or cache memory 1022, and may further include read-only memory (ROM) 1023.
Computer storage medium 1002 may also include a program/utility 1025 having a set (at least one) of program modules 1024, such program modules 1024 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
The electronic device 1000 may also communicate with one or more external devices 1004 (e.g., keyboard, pointing device, etc.), with one or more devices that enable a user to interact with the electronic device 1000, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 1000 to communicate with one or more other electronic devices. Such communication may occur via input/output (I/O) interface 1005. Also, the electronic device 1000 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via the network adapter 1006. As shown, the network adapter 1006 communicates with the other modules for the electronic device 1000 over a bus 1003. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 1000, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
In some possible embodiments, the various aspects of a method for reconstructing a three-dimensional model of a head provided by the present invention may also be implemented in the form of a program product, which includes program code for causing a computer device to perform the steps in the method for reconstructing a three-dimensional model of a head according to various exemplary embodiments of the present invention described above in this specification, when the program product is run on the computer device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The program product for reconstruction of a head three-dimensional model according to an embodiment of the present invention may employ a portable compact disc read-only memory (CD-ROM), include program code, and run on an electronic device. However, the program product of the present invention is not limited thereto; in this document, a readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++ and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the consumer electronic device, partly on the consumer electronic device, as a stand-alone software package, partly on the consumer electronic device and partly on a remote electronic device, or entirely on the remote electronic device or server. In the latter case, the remote electronic device may be connected to the consumer electronic device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external electronic device (for example, through the Internet using an Internet service provider).
It should be noted that although several modules of the apparatus are mentioned in the above detailed description, such division is merely exemplary and not mandatory. Indeed, the features and functions of two or more of the modules described above may be embodied in one module according to embodiments of the invention. Conversely, the features and functions of one module described above may be further divided into embodiments by a plurality of modules.
Moreover, while the operations of the method of the invention are depicted in the drawings in a particular order, this does not require or imply that the operations must be performed in this particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, magnetic disk computer storage media, CD-ROMs, optical computer storage media, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable computer storage medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable computer storage medium produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A method of reconstructing a three-dimensional model of a head, the method comprising:
for any target object, inputting a multi-view color depth (RGBD) image of the target object into a pre-trained head three-dimensional model reconstruction neural network to obtain a head three-dimensional model of the target object;
wherein the head three-dimensional model reconstruction neural network is trained in the following manner:
obtaining a training sample, wherein the training sample comprises rendered images of a head three-dimensional model rendered under different viewing angles and different illumination conditions and target signed distance field (SDF) values respectively corresponding to the rendered images;
inputting the training sample into the head three-dimensional model reconstruction neural network to obtain predicted SDF values corresponding to the rendered images, predicted rendered images, and a head three-dimensional model;
obtaining a first intermediate loss value based on each predicted rendered image and each rendered image, and obtaining a second intermediate loss value based on the predicted SDF values and the target SDF values;
obtaining a target loss value by using the first intermediate loss value and the second intermediate loss value;
and if the target loss value is greater than a specified threshold, ending the training of the head three-dimensional model reconstruction neural network.
2. The method of claim 1, further comprising:
if the target loss value is not greater than the specified threshold, adjusting the specified parameters of the head three-dimensional model reconstruction neural network and returning to the step of inputting the training sample into the head three-dimensional model reconstruction neural network, until the target loss value is greater than the specified threshold, and then ending the training of the head three-dimensional model reconstruction neural network.
3. The method of claim 1, wherein obtaining a first intermediate loss value based on each predicted rendered image and each rendered image comprises:
for any pixel point in the rendered image, obtaining a sub-loss value corresponding to the pixel point based on the pixel value of the pixel point and the pixel value of the target pixel point corresponding to the pixel point in the predicted rendered image, wherein the target pixel point is the pixel point in the predicted rendered image whose position coordinates are the same as those of the pixel point in the rendered image; and
obtaining the first intermediate loss value by using the sub-loss values corresponding to the pixel points in the rendered image.
4. The method of claim 3, wherein obtaining the sub-loss value corresponding to the pixel point based on the pixel value of the pixel point and the pixel value of the target pixel point corresponding to the pixel point in the predicted rendered image comprises:
determining the absolute value of the difference between the pixel value of the pixel point and the pixel value of the target pixel point as the sub-loss value corresponding to the pixel point;
and obtaining the first intermediate loss value by using the sub-loss values corresponding to the pixel points in the rendered image comprises:
adding the sub-loss values corresponding to the pixel points in the rendered image to obtain the first intermediate loss value.
5. The method of claim 1, wherein obtaining a second intermediate loss value based on the predicted SDF value and the target SDF value comprises:
determining the absolute value of the difference between the predicted SDF value and the target SDF value as the second intermediate loss value;
and obtaining a target loss value by using the first intermediate loss value and the second intermediate loss value comprises:
adding the first intermediate loss value and the second intermediate loss value to obtain the target loss value.
6. An electronic device comprising a processor and a memory, the processor and the memory being connected by a bus;
the memory having stored therein a computer program, the processor being configured to perform the following operations based on the computer program:
for any target object, inputting a multi-view color depth (RGBD) image of the target object into a pre-trained head three-dimensional model reconstruction neural network to obtain a head three-dimensional model of the target object;
wherein the head three-dimensional model reconstruction neural network is trained in the following manner:
obtaining a training sample, wherein the training sample comprises rendered images of a head three-dimensional model rendered under different viewing angles and different illumination conditions and target signed distance field (SDF) values corresponding to the rendered images;
inputting the training sample into the head three-dimensional model reconstruction neural network to obtain predicted SDF values corresponding to the rendered images, predicted rendered images, and a head three-dimensional model;
obtaining a first intermediate loss value based on each predicted rendered image and each rendered image, and obtaining a second intermediate loss value based on the predicted SDF values and the target SDF values;
obtaining a target loss value by using the first intermediate loss value and the second intermediate loss value;
and if the target loss value is greater than a specified threshold, ending the training of the head three-dimensional model reconstruction neural network.
7. The electronic device of claim 6, wherein the processor is further configured to:
if the target loss value is not greater than the specified threshold, adjusting the specified parameters of the head three-dimensional model reconstruction neural network and returning to the step of inputting the training sample into the head three-dimensional model reconstruction neural network, until the target loss value is greater than the specified threshold, and then ending the training of the head three-dimensional model reconstruction neural network.
8. The electronic device of claim 6, wherein, when obtaining the first intermediate loss value based on each predicted rendered image and each rendered image, the processor is specifically configured to:
for any pixel point in the rendered image, obtain a sub-loss value corresponding to the pixel point based on the pixel value of the pixel point and the pixel value of the target pixel point corresponding to the pixel point in the predicted rendered image, wherein the target pixel point is the pixel point in the predicted rendered image whose position coordinates are the same as those of the pixel point in the rendered image; and
obtain the first intermediate loss value by using the sub-loss values corresponding to the pixel points in the rendered image.
9. The electronic device according to claim 8, wherein, when obtaining the sub-loss value corresponding to the pixel point based on the pixel value of the pixel point and the pixel value of the target pixel point corresponding to the pixel point in the predicted rendered image, the processor is specifically configured to:
determine the absolute value of the difference between the pixel value of the pixel point and the pixel value of the target pixel point as the sub-loss value corresponding to the pixel point;
and, when obtaining the first intermediate loss value by using the sub-loss values corresponding to the pixel points in the rendered image, is specifically configured to:
add the sub-loss values corresponding to the pixel points in the rendered image to obtain the first intermediate loss value.
10. The electronic device of claim 6, wherein, when obtaining the second intermediate loss value based on the predicted SDF value and the target SDF value, the processor is specifically configured to:
determine the absolute value of the difference between the predicted SDF value and the target SDF value as the second intermediate loss value;
and, when obtaining the target loss value by using the first intermediate loss value and the second intermediate loss value, is specifically configured to:
add the first intermediate loss value and the second intermediate loss value to obtain the target loss value.
CN202210842557.4A 2022-07-18 2022-07-18 Head three-dimensional model reconstruction method and electronic equipment Pending CN115272565A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210842557.4A CN115272565A (en) 2022-07-18 2022-07-18 Head three-dimensional model reconstruction method and electronic equipment

Publications (1)

Publication Number Publication Date
CN115272565A true CN115272565A (en) 2022-11-01

Family

ID=83767862

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210842557.4A Pending CN115272565A (en) 2022-07-18 2022-07-18 Head three-dimensional model reconstruction method and electronic equipment

Country Status (1)

Country Link
CN (1) CN115272565A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116434146A (en) * 2023-04-21 2023-07-14 河北信服科技有限公司 Three-dimensional visual integrated management platform
CN116434146B (en) * 2023-04-21 2023-11-03 河北信服科技有限公司 Three-dimensional visual integrated management platform
CN117274642A (en) * 2023-09-20 2023-12-22 肇庆医学高等专科学校 Network image data acquisition and analysis method and system
CN117274642B (en) * 2023-09-20 2024-03-26 肇庆医学高等专科学校 Network image data acquisition and analysis method and system
CN117745924A (en) * 2024-02-19 2024-03-22 北京渲光科技有限公司 Neural rendering method, system and equipment based on depth unbiased estimation

Similar Documents

Publication Publication Date Title
US11557085B2 (en) Neural network processing for multi-object 3D modeling
CN109791697B (en) Predicting depth from image data using statistical models
US11830211B2 (en) Disparity map acquisition method and apparatus, device, control system and storage medium
CN115272565A (en) Head three-dimensional model reconstruction method and electronic equipment
CN113811920A (en) Distributed pose estimation
CN108491763B (en) Unsupervised training method and device for three-dimensional scene recognition network and storage medium
KR20220029335A (en) Method and apparatus to complement the depth image
US20230419521A1 (en) Unsupervised depth prediction neural networks
EP3872760A2 (en) Method and apparatus of training depth estimation network, and method and apparatus of estimating depth of image
US20220358675A1 (en) Method for training model, method for processing video, device and storage medium
Poiesi et al. Cloud-based collaborative 3D reconstruction using smartphones
CN115294268A (en) Three-dimensional model reconstruction method of object and electronic equipment
CN115690382A (en) Training method of deep learning model, and method and device for generating panorama
CN114677422A (en) Depth information generation method, image blurring method and video blurring method
CN115661336A (en) Three-dimensional reconstruction method and related device
CN116630514A (en) Image processing method, device, computer readable storage medium and electronic equipment
CN110827341A (en) Picture depth estimation method and device and storage medium
CN109816791B (en) Method and apparatus for generating information
CN115527011A (en) Navigation method and device based on three-dimensional model
CN113761965B (en) Motion capture method, motion capture device, electronic equipment and storage medium
CN114494574A (en) Deep learning monocular three-dimensional reconstruction method and system based on multi-loss function constraint
CN110545373B (en) Spatial environment sensing method and device
CN116246026B (en) Training method of three-dimensional reconstruction model, three-dimensional scene rendering method and device
CN116188698B (en) Object processing method and electronic equipment
CN116681818B (en) New view angle reconstruction method, training method and device of new view angle reconstruction network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination