CN113593001A - Target object three-dimensional reconstruction method and device, computer equipment and storage medium


Info

Publication number: CN113593001A
Authority: CN (China)
Prior art keywords: three-dimensional, parameterized model, target object, two-dimensional image, model
Legal status: Pending
Application number: CN202110167508.0A
Other languages: Chinese (zh)
Inventors: 卢湖川 (Lu Huchuan), 陈建川 (Chen Jianchuan), 张莹 (Zhang Ying)
Current Assignee: Dalian University of Technology; Tencent Technology (Shenzhen) Co., Ltd.
Original Assignee: Dalian University of Technology; Tencent Technology (Shenzhen) Co., Ltd.
Application CN202110167508.0A filed by Dalian University of Technology and Tencent Technology (Shenzhen) Co., Ltd.
Publication of CN113593001A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00: Three-dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T15/00: 3D [Three Dimensional] image rendering
    • G06T15/005: General purpose rendering architectures
    • G06T15/04: Texture mapping

Abstract

The present application relates to artificial intelligence and computer vision technology, and discloses a method, an apparatus, a computer device and a storage medium for three-dimensional reconstruction of a target object. The method comprises the following steps: acquiring a two-dimensional image including a target object; creating a three-dimensional parameterized model of the target object from the two-dimensional image; generating a three-dimensional continuous surface of the target object from the two-dimensional image, the three-dimensional continuous surface being a three-dimensional surface obtained by continuously representing the surface of the target object; performing mesh division on the three-dimensional continuous surface to obtain a three-dimensional mesh surface; and adjusting model parameters of the parameterized model so as to perform registration processing on a mesh surface in the parameterized model and the three-dimensional mesh surface, obtaining a final target parameterized model of the target object. The method can improve the accuracy of three-dimensional reconstruction.

Description

Target object three-dimensional reconstruction method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of computer vision and artificial intelligence, and in particular, to a method and an apparatus for three-dimensional reconstruction of a target object, a computer device, and a storage medium.
Background
With the development of computer vision and artificial intelligence technology, application scenarios for three-dimensional reconstruction of a target object keep emerging. For example, three-dimensional reconstruction of the human body is of great value in application scenarios such as human body animation, virtual reality and games. There are various methods for three-dimensional reconstruction of a target object, and reconstruction based on a two-dimensional image of the target object is one of them.
At present, three-dimensional reconstruction methods based on a two-dimensional image of a target object generally reconstruct a parameterized model of the target object from the two-dimensional image. However, owing to the limitations of the parameterized model, the reconstructed parameterized model expresses the shape of the target object only to a limited extent and often lacks detail information, so the three-dimensional reconstruction result of the target object is inaccurate.
Disclosure of Invention
In view of the above, it is necessary to provide a method, an apparatus, a computer device and a storage medium for three-dimensional reconstruction of a target object that can improve the accuracy of three-dimensional reconstruction.
A method of three-dimensional reconstruction of a target object, the method comprising:
acquiring a two-dimensional image including a target object;
creating a three-dimensional parameterized model of the target object from the two-dimensional image;
generating a three-dimensional continuous surface of the target object from the two-dimensional image; the three-dimensional continuous surface being a three-dimensional surface obtained by continuously representing the surface of the target object;
performing mesh division on the three-dimensional continuous surface to obtain a three-dimensional mesh surface;
and adjusting model parameters of the parameterized model so as to perform registration processing on a mesh surface in the parameterized model and the three-dimensional mesh surface, obtaining a final target parameterized model of the target object.
In one embodiment, the method further comprises:
adding an offset parameter into the parameterized model to obtain a deformable parameterized model;
the adjusting model parameters of the parameterized model to register a mesh surface in the parameterized model with the three-dimensional mesh surface includes:
adjusting offset parameters corresponding to surface vertices of the deformable parameterized model to register mesh surfaces in the parameterized model with the three-dimensional mesh surface.
In one embodiment, the model parameters of the deformable parameterized model further comprise global feature parameters, pose parameters, and shape parameters;
the adjusting offset parameters corresponding to the surface vertices of the deformable parameterized model to register the mesh surface in the parameterized model with the three-dimensional mesh surface comprises:
adjusting global feature parameters of the deformable parameterized model so as to perform depth registration of the mesh surface in the parameterized model with the three-dimensional mesh surface;
adjusting the pose parameters and the shape parameters of the depth-registered parameterized model so as to perform coincidence registration of the mesh surface in the parameterized model with the three-dimensional mesh surface;
and adjusting offset parameters corresponding to surface vertices of the coincidence-registered parameterized model so as to perform feature registration of the mesh surface in the parameterized model with the three-dimensional mesh surface.
In one embodiment, the adjusting offset parameters corresponding to the surface vertices of the deformable parameterized model to perform a registration process of the mesh surface in the parameterized model with the three-dimensional mesh surface includes:
obtaining a surface vertex semantic segmentation result of the parameterized model;
determining, according to the surface vertex semantic segmentation result, surface vertices of the deformable parameterized model whose geometric shape changes drastically;
fixing the offset parameters corresponding to the determined surface vertices, and adjusting the offset parameters corresponding to the surface vertices in the deformable parameterized model except the determined surface vertices, so as to perform registration processing on the mesh surface in the parameterized model and the three-dimensional mesh surface.
In one embodiment, the two-dimensional image is a single-frame two-dimensional image; the parameterized model created according to the single-frame two-dimensional image is a first parameterized model;
the obtaining of the surface vertex semantic segmentation result of the parameterized model comprises:
acquiring a multi-frame two-dimensional image comprising the target object;
acquiring a second parameterized model of the target object, which is correspondingly established according to each two-dimensional image in the plurality of frames of two-dimensional images respectively;
performing semantic segmentation on each frame of the multi-frame two-dimensional image to obtain a surface vertex semantic segmentation result of the corresponding second parameterized model;
and determining the surface vertex semantic segmentation result of the first parameterized model according to the surface vertex semantic segmentation result of each second parameterized model.
In one embodiment, after determining the surface vertex semantic segmentation result of the first parameterized model according to the surface vertex semantic segmentation result of each of the second parameterized models, the method further comprises:
determining, in the first parameterized model, invisible surface vertices whose semantic segmentation results are undetermined;
determining visible surface vertices within a preset neighborhood range of the invisible surface vertices;
and determining the semantic segmentation result of the invisible surface vertex according to the determined semantic segmentation result of the visible surface vertex.
In one embodiment, the adjusting model parameters of the parameterized model to register a mesh surface in the parameterized model with the three-dimensional mesh surface includes:
obtaining a target loss function with a plurality of constraints; the target loss function comprising a mesh difference loss function, a surface topology loss function and a deformation loss function;
iteratively adjusting model parameters of the parameterized model in the direction of minimizing the target loss function, so as to register the mesh surface in the parameterized model with the three-dimensional mesh surface.
In one embodiment, the two-dimensional image is a single-frame two-dimensional image; the parameterized model created according to the single-frame two-dimensional image is a first parameterized model;
the method further comprises the following steps:
acquiring a multi-frame two-dimensional image comprising the target object;
acquiring a second parameterized model of the target object, which is correspondingly established according to each two-dimensional image in the plurality of frames of two-dimensional images respectively;
mapping, according to texture coordinates corresponding to the surface vertices of each second parameterized model, the points corresponding to the target object in the corresponding two-dimensional image to a texture space, so as to obtain an initial texture map of the target object corresponding to each frame of two-dimensional image;
fusing the initial texture maps to obtain a texture map of the target object;
and performing texture rendering on the target parameterized model according to the texture map.
In one embodiment, the fusing the initial texture maps to obtain the texture map of the target object includes:
determining a fusion sequence corresponding to each frame of two-dimensional image according to the root node direction of each second parameterized model;
and fusing the initial texture maps corresponding to the two-dimensional images of the frames according to the fusion sequence to obtain the texture map of the target object.
In one embodiment, the method further comprises:
acquiring visibility maps of the target object corresponding to each frame of two-dimensional image respectively;
the fusing the initial texture maps corresponding to the two-dimensional images of the frames according to the fusion sequence to obtain the texture map of the target object includes:
selecting, according to the fusion sequence and starting from the first two-dimensional image, a current two-dimensional image, and generating a texture map corresponding to the current two-dimensional image according to the initial texture map and the visibility map corresponding to the current two-dimensional image;
fusing the texture map corresponding to the current two-dimensional image with the accumulated fused texture map;
after the fusion, taking the next two-dimensional image in the fusion sequence as the current two-dimensional image, and iteratively returning to the step of generating the texture map corresponding to the current two-dimensional image according to the initial texture map and the visibility map corresponding to the current two-dimensional image, until the iteration stops, thereby obtaining the texture map of the target object.
In one embodiment, the obtaining the visibility maps corresponding to the two-dimensional images of the frames includes:
generating a normal vector map corresponding to the corresponding two-dimensional image according to the normal vector of each second parameterized model surface;
and for each frame of two-dimensional image, determining the visibility of each point of the target object in the normal vector map according to the degree of closeness between the normal vector direction in the normal vector map corresponding to the two-dimensional image and the shooting direction of the two-dimensional image, and obtaining the visibility map corresponding to the two-dimensional image.
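As an illustration of the visibility computation described above, the following Python/NumPy sketch derives a visibility map from the closeness between the normal direction and the shooting direction, and then fuses per-frame texture maps weighted by visibility. All function and variable names are assumptions of this sketch; the application describes a sequential fusion in a determined order, and the weighted accumulation below is a simplified illustrative rule, not the prescribed one.

    import numpy as np

    def visibility_map(normal_map, view_dir):
        # Per-pixel visibility from the closeness between the surface
        # normal and the shooting direction of the camera: the more a
        # point faces the camera, the more visible it is.  Clamping to
        # [0, 1] is an assumption of this sketch.
        cosine = -(normal_map @ view_dir)   # large where the surface faces the camera
        return np.clip(cosine, 0.0, 1.0)

    def fuse_textures(initial_texture_maps, visibility_maps):
        # Accumulate the per-frame initial texture maps in the fusion
        # order, weighting each texel by its visibility (illustrative
        # fusion rule, simplified from the sequential fusion described).
        fused = np.zeros_like(initial_texture_maps[0], dtype=np.float64)
        weight = np.zeros(visibility_maps[0].shape + (1,))
        for texture, visibility in zip(initial_texture_maps, visibility_maps):
            v = visibility[..., None]
            fused += v * texture
            weight += v
        return fused / np.maximum(weight, 1e-6)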
In one embodiment, the target parameterized model is a drivable parameterized model;
the method further comprises the following steps:
acquiring action parameters;
and substituting the action parameters into the target parameterized model to drive the target parameterized model to execute corresponding actions.
An apparatus for three-dimensional reconstruction of a target object, the apparatus comprising:
an image acquisition module for acquiring a two-dimensional image including a target object;
a parametric model creation module for creating a three-dimensional parametric model of the target object from the two-dimensional image;
a continuous surface generation module for generating a three-dimensional continuous surface of the target object from the two-dimensional image; the three-dimensional continuous surface being a three-dimensional surface obtained by continuously representing the surface of the target object;
the mesh division module is used for carrying out mesh division on the three-dimensional continuous surface to obtain a three-dimensional mesh surface;
and a model parameter adjusting module for adjusting the model parameters of the parameterized model so as to perform registration processing on the mesh surface in the parameterized model and the three-dimensional mesh surface, obtaining the final target parameterized model of the target object.
A computer device comprising a memory and a processor, the memory having stored therein a computer program, which, when executed by the processor, causes the processor to perform the steps of the method for three-dimensional reconstruction of a target object according to embodiments of the present application.
A computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, causes the processor to perform the steps of the method for three-dimensional reconstruction of a target object according to embodiments of the present application.
A computer program product or computer program comprising computer instructions stored in a computer readable storage medium; the processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions, so that the computer device executes the steps in the method for three-dimensional reconstruction of the target object according to the embodiments of the present application.
According to the above method, apparatus, computer device and storage medium for three-dimensional reconstruction of a target object, a three-dimensional parameterized model of the target object is created from a two-dimensional image of the target object, a three-dimensional continuous surface of the target object is generated from the two-dimensional image, and the three-dimensional continuous surface is mesh-divided to obtain a three-dimensional mesh surface; model parameters of the parameterized model are then adjusted so as to register the mesh surface in the parameterized model with the three-dimensional mesh surface, obtaining a final target parameterized model of the target object. Because the three-dimensional continuous surface carries the detail information of the target object, the target parameterized model obtained by registering the mesh surface in the parameterized model with the three-dimensional mesh surface has stronger shape expression capability for the target object and can carry the detail information of the target object. This avoids the limitation that a parameterized model obtained purely by parameterized three-dimensional reconstruction expresses the shape of the target object only to a limited extent and lacks detail information, and improves the accuracy of three-dimensional reconstruction of the target object.
Drawings
FIG. 1 is a diagram of an application environment of a method for three-dimensional reconstruction of a target object in one embodiment;
FIG. 2 is a schematic flow chart illustrating a three-dimensional reconstruction method of a target object according to an embodiment;
FIG. 3 is a schematic diagram of creating a parameterized model and a three-dimensional continuous surface in one embodiment;
FIG. 4 is a schematic illustration of a registration process of a parameterized model and a three-dimensional continuous surface in one embodiment;
FIG. 5 is a diagram illustrating the determination of the result of semantic segmentation of surface vertices of a first parameterized model in one embodiment;
FIG. 6 is a diagram of determining an initial texture map and a visibility map, in one embodiment;
FIG. 7 is a diagram illustrating a texture map of a target object based on fusion of an initial texture map and a visibility map, under an embodiment;
FIG. 8 is a diagram illustrating the effect of a parameterized model of an object in one embodiment;
FIG. 9 is a diagram illustrating motion driving of a target parameterized model in one embodiment;
FIG. 10 is a schematic overall flowchart of a three-dimensional reconstruction method of a target object according to an embodiment;
FIG. 11 is a block diagram of an apparatus for three-dimensional reconstruction of a target object according to an embodiment;
FIG. 12 is a block diagram showing the structure of a three-dimensional reconstruction apparatus for a target object according to another embodiment;
FIG. 13 is a diagram showing an internal structure of a computer device in one embodiment;
FIG. 14 is a diagram showing an internal structure of a computer device in another embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The three-dimensional reconstruction method for the target object can be applied to the application environment shown in fig. 1. The image acquisition device 102 may acquire an image of the target object 104, and the computer device 106 may acquire a two-dimensional image acquired by the image acquisition device 102, and perform three-dimensional reconstruction on the target object according to the two-dimensional image by using the target object three-dimensional reconstruction method in the embodiments of the present application, so as to obtain a target parameterized model of the target object. Computer device 106 may acquire a two-dimensional image from image capture device 102 by communicating with image capture device 102 over a network. The two-dimensional image captured by the image capture device 102 may also be manually stored in the computer device 106.
The image capturing device 102 may be, but is not limited to, various cameras, video recorders, and the like. The target object 104 may be, but is not limited to, a human body, an animal, a robot, and the like. The computer device 106 may be a terminal or a server, or may be implemented by a terminal and a server together. The terminal can be, but is not limited to, various personal computers, notebook computers, smartphones, tablet computers, vehicle-mounted computers and portable wearable devices. The server can be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (content delivery network), and big data and artificial intelligence platforms.
It should be noted that, in other embodiments, the computer device 106 may also directly acquire the two-dimensional image of the target object 104 from a database or a storage device, instead of using the image acquisition device 102 to acquire the two-dimensional image of the target object 104.
It can be understood that the three-dimensional reconstruction method of the target object in the embodiments of the present application can effectively implement three-dimensional reconstruction of the target object by using a computer vision technology, a machine learning technology, and the like in an artificial intelligence technology.
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, involving both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
Computer Vision (CV) technology is a science that studies how to make machines "see". It refers to using cameras and computers in place of human eyes to identify, track and measure targets and perform other machine vision tasks, and to further process the images so that they become more suitable for human observation or for transmission to instruments for detection. As a scientific discipline, computer vision studies related theories and techniques in an attempt to build artificial intelligence systems capable of acquiring information from images or multidimensional data. Computer vision technologies generally include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, and simultaneous localization and mapping, and also include common biometric technologies such as face recognition and fingerprint recognition.
Machine Learning (ML) is a multi-disciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory and other disciplines. It specializes in studying how a computer can simulate or implement human learning behaviors to acquire new knowledge or skills, and reorganize existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstrations.
In an embodiment, as shown in fig. 2, a method for three-dimensional reconstruction of a target object is provided, and the embodiment of the present application is described by taking the method as an example applied to the computer device in fig. 1, and includes the following steps:
step 202, a two-dimensional image including a target object is acquired.
The target object is the object at which the three-dimensional reconstruction is aimed. A two-dimensional image including the target object means that the two-dimensional image contains image content of the target object.
In one embodiment, the target object may be any one of a human body, an animal, a robot, and the like.
In one embodiment, the image capture device may capture an image of the target object to obtain a two-dimensional image of the target object, and the computer device may then acquire the two-dimensional image captured by the image capture device.
In one embodiment, the computer device may acquire a two-dimensional image of the target object from the image acquisition device through a network. In another embodiment, the two-dimensional image acquired by the image acquisition device can be manually stored in the computer device through any one of a memory card, a mobile hard disk, a U disk and the like.
In one embodiment, the computer device may also directly acquire a two-dimensional image of an already existing target object. For example, the computer device may obtain a two-dimensional image of an existing target object from a database or locally.
In one embodiment, if the target object is a human body, the target object may be set to a standard posture suitable for three-dimensional reconstruction, and the image acquisition device acquires images of the human body held in the standard posture to obtain a two-dimensional image of the target object. For example, the standard posture may be the "A" pose shown in FIG. 3, in which the person stands with the body in the shape of the letter "A". It is understood that the standard posture can be set arbitrarily according to the three-dimensional reconstruction requirements, and is not limited here.
In one embodiment, the computer device may first acquire a plurality of frames of two-dimensional images including the target object, then select one of the frames of two-dimensional images from the plurality of frames of two-dimensional images, and perform step 204 and subsequent steps according to the selected single frame of two-dimensional image.
In one embodiment, the selected single-frame two-dimensional image is a two-dimensional image suitable for three-dimensional reconstruction, such as: may be a two-dimensional image of the front side of the target object.
In one embodiment, the plurality of frames of two-dimensional images may include two-dimensional images taken in respective different directions of the target object. In one embodiment, the plurality of frames of two-dimensional images may include two-dimensional images taken in respective different directions around the target object.
In one embodiment, the image capture device may wrap around the target object and capture multiple frames of two-dimensional images of the target object during the process. In another embodiment, the target object may rotate one revolution, during which multiple frames of two-dimensional images of the target object are acquired by the image acquisition device at a fixed location.
In one embodiment, the plurality of frames of two-dimensional images may be two-dimensional images extracted from continuously photographed videos. In another embodiment, the multi-frame two-dimensional image may also be two-dimensional images obtained by taking pictures for multiple times respectively.
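As a minimal sketch of the first variant, the following Python/OpenCV snippet extracts evenly spaced frames from a continuously shot video; the stride value is an illustrative assumption, not a value prescribed by this application.

    import cv2

    def extract_frames(video_path, stride=30):
        # Read a continuously shot video and keep every `stride`-th
        # frame as one two-dimensional image of the multi-frame set.
        capture = cv2.VideoCapture(video_path)
        frames, index = [], 0
        while True:
            ok, frame = capture.read()
            if not ok:
                break
            if index % stride == 0:
                frames.append(frame)
            index += 1
        capture.release()
        return frames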
In one embodiment, the computer device may cut out the two-dimensional image of the region where the target object is located from the two-dimensional image including the target object, and then perform step 204 and the following steps on the cut-out two-dimensional image. The two-dimensional image of the region where the target object is located is a two-dimensional image in which the image content corresponding to the target object occupies the main region. In this embodiment, cutting out the two-dimensional image of the region where the target object is located avoids problems such as multiple objects existing in the original two-dimensional image or the target object not being located at the center of the two-dimensional image, and can improve the accuracy of the subsequent three-dimensional reconstruction.
In one embodiment, the computer device may first detect key points of the target object, or a target frame in which the target object is located, in the original two-dimensional image including the target object, and then cut out the two-dimensional image of the area where the target object is located from the original two-dimensional image according to the key points or the target frame. As shown in FIG. 3, 302 is an original two-dimensional image in which the person is the target object; the target object is detected in 302 and the two-dimensional image 304 of the area where the target object is located is cut out.
In one embodiment, the computer device may detect a key point of the target object or a target frame in which the target object is located from a two-dimensional image including the target object using a human key point detection network (Keypoint R-CNN).
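The following sketch shows how a torchvision Keypoint R-CNN could be used to detect the target frame of the person and cut out the region where the target object is located; the padding margin and the choice of the highest-scoring detection are assumptions of this sketch, and argument names may vary slightly across torchvision versions.

    import torch
    import torchvision

    # Pre-trained human keypoint detection network (Keypoint R-CNN).
    detector = torchvision.models.detection.keypointrcnn_resnet50_fpn(
        pretrained=True)
    detector.eval()

    def crop_target_region(image, margin=0.1):
        # `image` is a CxHxW float tensor in [0, 1]; `margin` pads the
        # detected target frame by an illustrative 10% on each side.
        with torch.no_grad():
            detections = detector([image])[0]
        x1, y1, x2, y2 = detections["boxes"][0].tolist()  # best person box
        pad_x, pad_y = margin * (x2 - x1), margin * (y2 - y1)
        h, w = image.shape[1:]
        x1, y1 = max(0, int(x1 - pad_x)), max(0, int(y1 - pad_y))
        x2, y2 = min(w, int(x2 + pad_x)), min(h, int(y2 + pad_y))
        return image[:, y1:y2, x1:x2]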
Step 204, a three-dimensional parameterized model of the target object is created from the two-dimensional image.
The three-dimensional parameterized model is a three-dimensional model of the target object obtained by extracting model parameters from a two-dimensional image of the target object and modeling according to the model parameters. The surface of a three-dimensional parameterized model is formed by a mesh of vertices and patches.
In one embodiment, the three-dimensional parameterized model may be a parameterized model of a human body or a parameterized model of an animal. In one embodiment, the parameterized model of the human body may be any one of an SMPL model (Skinned Multi-Person Linear model), an SMPLH model (SMPL with hand gestures), an SMPLX model (SMPL with hand gestures and facial expressions), and the like. It is to be understood that there are many types of parameterized models, and the parameterized model used here is not limited; any parameterized model that models shape and pose separately may be used.
In one embodiment, the computer device may extract model parameters from the two-dimensional image and then input the model parameters into the initial parameterized model, creating a parameterized model of the target object. Specifically, the computer device may extract the shape parameters and the pose parameters of the target object from the two-dimensional image, and then input the shape parameters and the pose parameters into the initial parameterized model, creating the parameterized model of the target object.
The initial parameterized model is a parameterized model into which the shape parameters and pose parameters of the target object have not yet been input. It can be understood that the initial parameterized model does not have the specific shape and the specific pose of the target object; after the shape parameters and the pose parameters of the target object are input into the initial parameterized model, the parameterized model corresponding to the target object can be created.
In one embodiment, if the target object is a human body, the computer device may create an SMPLX model of the target object from the two-dimensional image using the ExPose method (a method for regressing an SMPLX model from an image). Compared with the SMPL model, the parameterized model created in this way additionally carries the hand gestures and facial expressions of the human body and has a higher resolution, so the accuracy of three-dimensional reconstruction can be improved.
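As a sketch of creating such a parameterized model, the snippet below builds an SMPLX body with the smplx Python package and evaluates its mesh surface; the model path is a placeholder, and the zero shape and pose parameters are stand-ins for values that a regressor such as ExPose would predict from the two-dimensional image.

    import torch
    import smplx  # SMPL-X reference implementation package

    MODEL_DIR = "/path/to/smplx/models"  # placeholder path (assumption)
    model = smplx.create(MODEL_DIR, model_type="smplx", gender="neutral")

    # Shape and pose parameters of the target object; zeros below are
    # stand-ins for values regressed from the two-dimensional image.
    betas = torch.zeros(1, 10)           # shape parameters
    body_pose = torch.zeros(1, 21 * 3)   # axis-angle pose parameters

    output = model(betas=betas, body_pose=body_pose, return_verts=True)
    vertices = output.vertices           # (1, 10475, 3) surface vertices
    faces = model.faces                  # patch (triangle) indices of the mesh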
Step 206, generating a three-dimensional continuous surface of the target object according to the two-dimensional image; the three-dimensional continuous surface is a three-dimensional surface obtained by continuously representing the surface of the target object.
It will be appreciated that the three-dimensional continuous surface is provided with detailed information of the target object.
In one embodiment, the three-dimensional continuous surface may be an implicit surface. An implicit surface is a surface defined by an implicit function, that is, an iso-surface of an implicit function.
In one embodiment, the computer device may generate the implicit surface of the target object from the two-dimensional image by a PIFuHD method (a method of generating the implicit surface of the target object through an end-to-end trainable model framework from coarse to fine).
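To make the implicit-surface representation concrete, the toy sketch below uses the signed distance function of a unit sphere in place of the learned pixel-aligned function that PIFuHD predicts from the two-dimensional image; the surface of the target object is then the zero iso-surface of this function.

    import numpy as np

    def implicit_fn(points):
        # Stand-in implicit function: signed distance to a unit sphere.
        # In the method of this application this role is played by a
        # learned function conditioned on the two-dimensional image.
        return np.linalg.norm(points, axis=-1) - 1.0

    # The three-dimensional continuous surface is {x : f(x) = 0}:
    # negative values lie inside the surface, positive values outside.
    probe = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
    print(implicit_fn(probe))  # [-1.  1.]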
As shown in FIG. 3, the computer device may create a parameterized model 308 of the target object and a three-dimensional continuous surface 306 of the target object from the two-dimensional image 304 of the region where the target object is located. It can be understood that, because the three-dimensional continuous surface is continuous and of high resolution, the three-dimensional continuous surface 306 carries the detail information of the target object, as can be seen from FIG. 3: clothes, hair, shoes and the like. The parameterized model is created by inputting shape parameters and pose parameters into the initial parameterized model, so the parameterized model 308 of the target human body is generally a naked human body model as shown in FIG. 3, and lacks detail information such as the clothes, shoes and hair of the target human body.
And 208, performing mesh division on the three-dimensional continuous surface to obtain a three-dimensional mesh surface.
The three-dimensional Mesh surface (Mesh) refers to a Mesh surface composed of vertices and patches.
In one embodiment, the computer device may adopt a Marching cubes method (iso-surface extraction algorithm, a method for extracting a mesh) to perform mesh division on the three-dimensional continuous surface to obtain a three-dimensional mesh surface.
In other embodiments, the computer device may also adopt other methods for extracting meshes to perform mesh division on the three-dimensional continuous surface to obtain a three-dimensional mesh surface, which is not limited.
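Continuing the sketch above (reusing the implicit_fn stand-in), the implicit function can be sampled on a regular grid and the zero iso-surface extracted as a three-dimensional mesh surface with the Marching cubes implementation in scikit-image; the grid resolution is an illustrative assumption.

    import numpy as np
    from skimage import measure

    N = 128  # illustrative grid resolution
    axis = np.linspace(-1.5, 1.5, N)
    grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)
    volume = implicit_fn(grid.reshape(-1, 3)).reshape(N, N, N)

    # Mesh division: extract the zero iso-surface as vertices + patches.
    verts, faces, normals, _ = measure.marching_cubes(volume, level=0.0)
    # `verts` and `faces` now define the three-dimensional mesh surface
    # used for registration against the parameterized model.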
It can be understood that the parameterized model and the three-dimensional continuous surface have different representations: the parameterized model is a mesh composed of vertices and patches, whereas the three-dimensional continuous surface is a continuously represented three-dimensional surface. Therefore, in this embodiment, the three-dimensional continuous surface is mesh-divided into a three-dimensional mesh surface, which gives the parameterized model and the three-dimensional continuous surface the same representation and facilitates the subsequent registration between them; that is, during registration only the mesh surface in the parameterized model and the three-dimensional mesh surface need to be registered.
And 210, adjusting model parameters of the parameterized model to perform registration processing on the grid surface and the three-dimensional grid surface in the parameterized model to obtain a final target parameterized model of the target object.
Registration refers to processing that adjusts features such as shape, pose and details between two mesh surfaces until they are consistent. The target parameterized model is the parameterized model finally obtained by the three-dimensional reconstruction of the target object.
In one embodiment, an offset parameter may be added to the parameterized model, and the model parameters of the adjusted parameterized model may include the offset parameter. In one embodiment, the model parameters of the adapted parameterized model may further include at least one of global feature parameters, pose parameters, shape parameters, and the like. The offset parameter is a model parameter for representing the offset of the surface vertex of the parametric model. The global feature parameter is a model parameter for characterizing a global feature of the parameterized model. The attitude parameter is a model parameter for characterizing the attitude of the parameterized model. The shape parameter is a model parameter for characterizing the shape of the parametric model.
In one embodiment, the computer device may adjust, in order from global to local, the model parameters characterizing the features of the parameterized model at the corresponding level, so as to register the mesh surface in the parameterized model with the three-dimensional mesh surface and obtain the final target parameterized model of the target object. For example, the computer device may first adjust the global feature parameters (corresponding to the global level), then the shape parameters and the pose parameters (corresponding to the level between global and local), and finally the offset parameters (corresponding to the local level). It can be understood that, each time the model parameters characterizing one aspect have been adjusted, their values are fixed before the model parameters of the next aspect are adjusted. Adjusting the model parameters from global to local in this way avoids the excessive local deformation caused by directly adjusting the parameters of the local level, improves the registration accuracy, and thus improves the accuracy of three-dimensional reconstruction.
In one embodiment, the computer device may obtain a pre-constructed objective loss function, iteratively adjust model parameters of the parameterized model towards minimizing the objective loss function to register a mesh surface in the parameterized model with the three-dimensional mesh surface.
In one embodiment, the objective loss functions used in the adjustment of the different model parameters by the computer device may be the same or different. That is, the corresponding objective loss function can be designed based on the difference in the characteristics of the parameterized model characterized by the adjusted model parameters.
In one embodiment, the target loss function may include at least one of a mesh difference loss function, a surface topology loss function, a deformation loss function, and the like.
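A minimal PyTorch sketch of such a multi-constraint target loss follows; the chamfer distance serves as the mesh difference term, a uniform Laplacian term as the surface topology term, and the magnitude of the vertex offsets as the deformation term. The weights, the padded neighbour layout and the concrete form of each term are assumptions of this sketch, not forms fixed by the application.

    import torch

    def mesh_difference_loss(a, b):
        # Symmetric nearest-neighbour (chamfer) distance between point
        # sets sampled from the two mesh surfaces.
        d = torch.cdist(a, b)
        return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

    def surface_topology_loss(verts, neighbours):
        # Uniform Laplacian smoothing: each vertex should stay close to
        # the centroid of its mesh neighbours.  `neighbours` is a (V, K)
        # index tensor (illustrative data layout).
        centroid = verts[neighbours].mean(dim=1)
        return (verts - centroid).pow(2).sum(dim=1).mean()

    def target_loss(verts, target_pts, neighbours, offsets,
                    w_mesh=1.0, w_topo=0.1, w_deform=0.01):
        # Weighted combination of the constraints; the weights are
        # illustrative assumptions.
        return (w_mesh * mesh_difference_loss(verts, target_pts)
                + w_topo * surface_topology_loss(verts, neighbours)
                + w_deform * offsets.pow(2).sum(dim=1).mean())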
As shown in FIG. 4, the computer device may perform registration processing of the parameterized model 308 created in FIG. 3 with the three-dimensional mesh surface of the three-dimensional continuous surface 306, obtaining the target parameterized model 402 shown in FIG. 4. As can be seen from the figure, the target parameterized model 402 carries the detail information of the three-dimensional continuous surface, which avoids the limitations that the parameterized model expresses the shape of the target object only to a limited extent and lacks detail information, and improves the accuracy of three-dimensional reconstruction of the target object.

In one embodiment, the computer device may obtain a texture map of the target object and then perform texture rendering on the target parameterized model according to the texture map, obtaining the texture-rendered target parameterized model. The texture map is used to represent information such as the high-frequency details and colors of the surface of the parameterized model.
In one embodiment, the computer device may obtain a plurality of frames of two-dimensional images including the target object, and then obtain a texture map of the target object according to the plurality of frames of two-dimensional images.
In one embodiment, the computer device may drive the target parameterized model to perform the action according to the action parameters. The motion parameters are model parameters for characterizing the motion of the target parameterized model.
In the above method for three-dimensional reconstruction of a target object, a three-dimensional parameterized model of the target object is created from a two-dimensional image of the target object, a three-dimensional continuous surface of the target object is generated from the two-dimensional image, and the three-dimensional continuous surface is mesh-divided to obtain a three-dimensional mesh surface; model parameters of the parameterized model are then adjusted so as to register the mesh surface in the parameterized model with the three-dimensional mesh surface, obtaining a final target parameterized model of the target object. Because the three-dimensional continuous surface carries the detail information of the target object, the target parameterized model obtained by registering the mesh surface in the parameterized model with the three-dimensional mesh surface has stronger shape expression capability for the target object and can carry the detail information of the target object. This avoids the limitation that a parameterized model obtained purely by parameterized three-dimensional reconstruction expresses the shape of the target object only to a limited extent and lacks detail information, and improves the accuracy of three-dimensional reconstruction of the target object. Moreover, parameterized models tend to suffer from depth ambiguity; for example, the legs of the parameterized model of a short target person are often incomplete. The target parameterized model obtained by this method avoids depth ambiguity and improves the accuracy of three-dimensional reconstruction.
In addition, a three-dimensional model of the target object built only on a three-dimensional continuous surface cannot be motion-driven, which is a limitation, whereas a parameterized model can be motion-driven. By adjusting the model parameters of the parameterized model and registering its mesh surface with the three-dimensional mesh surface extracted from the three-dimensional continuous surface, the resulting target parameterized model retains the advantage that a parameterized model can be driven by motions, avoiding the limitation that the three-dimensional continuous surface cannot be driven. The target parameterized model therefore carries detail information, which improves the accuracy of three-dimensional reconstruction, and can also be flexibly driven by motion parameters, which improves the applicability of the target parameterized model obtained by three-dimensional reconstruction and widens its application range.
In one embodiment, the method further comprises: and adding an offset parameter into the parameterized model to obtain the deformable parameterized model. In this embodiment, the step 210 of adjusting the model parameters of the parameterized model to perform the registration processing on the mesh surface and the three-dimensional mesh surface in the parameterized model includes: and adjusting the offset parameters corresponding to the surface vertices of the deformable parameterized model to perform registration processing on the mesh surface in the parameterized model and the three-dimensional mesh surface.
The deformable parameterized model is a parameterized model in which parameter values of offset parameters of the surface vertices can be adjusted, that is, a parameterized model in which the surface vertices can be offset.
In one embodiment, the computer device may adjust offset parameters corresponding to surface vertices of the deformable parameterized model to perform registration processing on the mesh surface in the parameterized model and the three-dimensional mesh surface to obtain a final target parameterized model of the target object.
In one embodiment, before adjusting the offset parameters corresponding to the surface vertices of the deformable parameterized model, the computer device may first adjust the model parameters of the global level of the deformable parameterized model so as to register the mesh surface in the parameterized model with the three-dimensional mesh surface, then fix the adjusted model parameters of the global level, and adjust the offset parameters corresponding to the surface vertices of the deformable parameterized model so as to register the mesh surface in the parameterized model with the three-dimensional mesh surface, obtaining the final target parameterized model of the target object. For example, the computer device may adjust the global feature parameters first, then the shape parameters and the pose parameters, and then the offset parameters. In this embodiment, adjusting the model parameters of the global level first and the offset parameters corresponding to the surface vertices afterwards avoids the excessive local offsets caused by directly adjusting the offset parameters, improves the registration accuracy, and thus improves the accuracy of three-dimensional reconstruction.
In one embodiment, the computer device may determine surface vertices corresponding to an incomplete part expressed by the three-dimensional continuous surface according to semantic segmentation results of the surface vertices of the parameterized model, fix offset parameters of the determined surface vertices, and adjust offset parameters corresponding to surface vertices other than the determined surface vertices in the deformable parameterized model, so as to perform registration processing on the mesh surface in the parameterized model and the three-dimensional mesh surface.
And the semantic segmentation result of the surface vertex is used for representing the part of the target object to which the surface vertex in the parameterized model belongs. Such as: if the target object is a human body, the surface vertex semantic segmentation result may include at least one of a face, a hand, a foot, a trunk, and the like. Namely, for example: if the semantic segmentation result of the surface vertex is a hand, the surface vertex belongs to the hand of the target human body.
It is understood that the three-dimensional continuous surface may represent incomplete areas of the target object, i.e., incomplete portions of the three-dimensional continuous surface.
In the above embodiment, a deformable parameterized model is obtained by adding offset parameters to the parameterized model, and the offset parameters corresponding to the surface vertices of the deformable parameterized model are adjusted to register the mesh surface in the parameterized model with the three-dimensional mesh surface. This makes the registration between the two mesh surfaces more accurate and realizes the registration of local detail features, so that the target parameterized model can carry the detail information of the three-dimensional continuous surface. The limitation that the parameterized model expresses the shape of the target object only to a limited extent and lacks detail information is thereby avoided, and the accuracy of three-dimensional reconstruction of the target object is improved.
In one embodiment, the model parameters of the deformable parameterized model further include global feature parameters, pose parameters, and shape parameters. In this embodiment, adjusting the offset parameters corresponding to the surface vertices of the deformable parameterized model so as to register the mesh surface in the parameterized model with the three-dimensional mesh surface includes: adjusting the global feature parameters of the deformable parameterized model so as to perform depth registration of the mesh surface in the parameterized model with the three-dimensional mesh surface; adjusting the pose parameters and the shape parameters of the depth-registered parameterized model so as to perform coincidence registration of the mesh surface in the parameterized model with the three-dimensional mesh surface; and adjusting the offset parameters corresponding to the surface vertices of the coincidence-registered parameterized model so as to perform feature registration of the mesh surface in the parameterized model with the three-dimensional mesh surface.
Depth registration refers to registering the mesh surface in the parameterized model with the three-dimensional mesh surface in the depth direction. Coincidence registration refers to registering the mesh surface in the parameterized model with the three-dimensional mesh surface so that the two surfaces coincide. Feature registration refers to registering the local detail features of the mesh surface in the parameterized model with those of the three-dimensional mesh surface.
In one embodiment, the global feature parameters may include a global direction parameter and a global translation parameter of the parameterized model. The global direction parameter is a model parameter representing the overall rotation direction of the parameterized model. The global translation parameter is a model parameter representing the overall displacement of the parameterized model.
In one embodiment, the computer device may first adjust the global feature parameters of the deformable parameterized model to perform depth registration of the mesh surface in the parameterized model with the three-dimensional mesh surface, and then fix the values of the adjusted global feature parameters; next, adjust the pose parameters and the shape parameters of the depth-registered parameterized model to perform coincidence registration of the mesh surface in the parameterized model with the three-dimensional mesh surface, and then fix the values of the adjusted global feature parameters, pose parameters and shape parameters; finally, adjust the offset parameters corresponding to the surface vertices of the coincidence-registered parameterized model to perform feature registration of the mesh surface in the parameterized model with the three-dimensional mesh surface, obtaining the final target parameterized model of the target object.
In the above embodiment, because a parameterized model created from a two-dimensional image often suffers from depth ambiguity, there is a large deviation in the depth direction between the parameterized model and the three-dimensional mesh surface of the three-dimensional continuous surface. Adjusting the global feature parameters of the deformable parameterized model first aligns the parameterized model and the three-dimensional mesh surface at the global depth and avoids the deviation in the depth direction. Adjusting the shape parameters and the pose parameters then adjusts the shape and the pose of the parameterized model so that it can coincide with the three-dimensional mesh surface. Adjusting the offset parameters only after the global feature parameters, shape parameters and pose parameters have been adjusted avoids the excessive local offsets caused by directly adjusting the offset parameters, improves the registration accuracy, and thus improves the accuracy of three-dimensional reconstruction.
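The global-to-local schedule described above can be sketched as three optimisation stages, each adjusting only one group of parameters while the others stay fixed. In the PyTorch sketch below the parameter tensors are stand-ins (SMPL-X sizes assumed for illustration), `loss_fn` is assumed to rebuild the mesh from the current parameters and evaluate the target loss, and the step count and learning rate are illustrative assumptions.

    import torch

    # Stand-in parameter tensors; in practice these are the model
    # parameters of the deformable parameterized model.
    global_feat = torch.zeros(6, requires_grad=True)   # direction + translation
    pose = torch.zeros(21 * 3, requires_grad=True)
    shape = torch.zeros(10, requires_grad=True)
    offsets = torch.zeros(10475, 3, requires_grad=True)

    def register_stage(params, loss_fn, steps=200, lr=1e-2):
        # One registration stage: iteratively adjust only `params` in
        # the direction of minimising the target loss; all other model
        # parameters stay fixed because they are not handed to Adam.
        optimiser = torch.optim.Adam(params, lr=lr)
        for _ in range(steps):
            optimiser.zero_grad()
            loss_fn().backward()
            optimiser.step()

    register_stage([global_feat], loss_fn)   # 1. depth registration
    register_stage([pose, shape], loss_fn)   # 2. coincidence registration
    register_stage([offsets], loss_fn)       # 3. feature registration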
In one embodiment, adjusting the offset parameters corresponding to the surface vertices of the deformable parameterized model to register the mesh surfaces in the parameterized model with the three-dimensional mesh surfaces comprises: obtaining a surface vertex semantic segmentation result of the parameterized model; determining surface vertexes with severe change of geometric shape in the deformable parameterized model according to the semantic segmentation result of the surface vertexes; and fixing the offset parameters corresponding to the determined surface vertices, and adjusting the offset parameters corresponding to the surface vertices except the determined surface vertices in the deformable parameterized model so as to perform registration processing on the mesh surface in the parameterized model and the three-dimensional mesh surface.
Specifically, the target semantic segmentation result corresponding to the surface vertex with the drastically changed geometric shape may be set in advance, and the computer device may determine the surface vertex corresponding to the target semantic segmentation result according to the semantic segmentation result corresponding to each surface vertex of the parameterized model, and use the determined surface vertex as the surface vertex with the drastically changed geometric shape.
The target semantic segmentation result refers to a semantic segmentation result corresponding to a surface vertex with a severely changed geometric shape.
It can be understood that, because the three-dimensional continuous surface tends to express positions of the target object where the geometric shape changes drastically incompletely or fuzzily, the offset parameters of the surface vertices of the parameterized model whose geometric shape changes drastically are fixed, and only the offset parameters corresponding to the remaining surface vertices are adjusted to register the mesh surface in the parameterized model with the three-dimensional mesh surface; the above limitation of the three-dimensional continuous surface can thereby be avoided.
For example, if the target object is a human body, the target semantic segmentation result may include at least one of the face, the hands, and the like. It can be understood that, because the geometric shapes of parts such as the face and the hands of a human body change drastically, at least one of the face and the hands is preset as the target semantic segmentation result. Parts such as the face and the hands are prone to incomplete or fuzzy expression in the three-dimensional continuous surface of the target human body; as shown in FIG. 3 and FIG. 4, the hands of the three-dimensional continuous surface are incomplete and the face is blurred. As shown in FIG. 4, the target parameterized model obtained by performing the registration processing under the constraint of the surface vertex semantic segmentation result 404 of the parameterized model retains the integrity of the hands and the face of the original parameterized model and avoids the above limitations of the three-dimensional continuous surface.
In one embodiment, the computer device may fix the parameter values of the offset parameters corresponding to the determined surface vertices to 0 and adjust the offset parameters corresponding to the surface vertices other than the determined surface vertices in the deformable parameterized model to register the mesh surface in the parameterized model with the three-dimensional mesh surface.
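For illustration only, a minimal PyTorch sketch of this fixing step follows; it masks the offsets of the determined surface vertices so that they stay at 0 during registration. The semantic label ids used for the face and the hands are hypothetical placeholders, not values from this application:

```python
import torch

def masked_offsets(offsets: torch.Tensor, labels: torch.Tensor,
                   fixed_labels=(1, 2)) -> torch.Tensor:
    """Zero out (fix) the offset parameters of vertices whose semantic label
    marks drastically changing geometry (e.g. face = 1, hands = 2; these ids
    are assumptions), so only the remaining offsets take part in registration.

    offsets: (N, 3) per-vertex offset parameters; labels: (N,) semantic ids.
    """
    keep = torch.ones(labels.shape[0], 1, dtype=offsets.dtype)
    for lab in fixed_labels:
        keep[labels == lab] = 0.0
    # Multiplying inside the forward pass also zeroes the gradients of the
    # fixed offsets, so the optimizer never moves them away from 0.
    return offsets * keep
```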
In the above embodiment, the computer device may fix the offset parameter of the surface vertex with the drastic change in the geometric shape according to the surface vertex semantic segmentation result of the parameterized model, and adjust only the offset parameters corresponding to the surface vertices except the determined surface vertex to perform registration processing on the mesh surface and the three-dimensional mesh surface in the parameterized model, thereby avoiding the problems of incomplete expression or fuzzy expression of the three-dimensional continuous surface on the position with the drastic change in the geometric shape, and the like, so that the target parameterized model can reconstruct the position with the drastic change in the geometric shape completely, and the accuracy of three-dimensional reconstruction is improved.
In one embodiment, the two-dimensional image is a single frame two-dimensional image, and the parameterized model created from the single frame two-dimensional image is the first parameterized model. In this embodiment, obtaining the surface vertex semantic segmentation result of the parameterized model includes: acquiring a multi-frame two-dimensional image comprising a target object; acquiring a second parameterized model of the target object correspondingly created according to each two-dimensional image in the multiple two-dimensional images; semantic segmentation is carried out on each frame of two-dimensional image in the multi-frame of two-dimensional image to obtain a partially visible surface vertex semantic segmentation result of the corresponding second parameterized model; and determining the surface vertex semantic segmentation result of the first parameterized model according to the partially visible surface vertex semantic segmentation result of each second parameterized model.
The partially visible surface vertex semantic segmentation result refers to a result in which only part of the surface vertices have semantic segmentation results, while the semantic segmentation results of the remaining surface vertices are missing.
In one embodiment, the single two-dimensional image may or may not be included in the multi-frame two-dimensional image.
In one embodiment, a multi-frame two-dimensional image including a target object may be acquired by the method in the foregoing embodiments of acquiring a multi-frame two-dimensional image.
In one embodiment, the computer device may create the second parameterized model of the target object according to each two-dimensional image of the plurality of frames of two-dimensional images respectively by the method in the foregoing embodiments for creating the three-dimensional parameterized model of the target object according to the two-dimensional images. And respectively creating a second parameterized model for each two-dimensional image.
In one embodiment, the computer device may perform semantic segmentation on each two-dimensional image of the multiple frames of two-dimensional images, to obtain semantic segmentation results corresponding to each two-dimensional image. Then, the computer device may determine, according to the semantic segmentation results respectively corresponding to the two-dimensional images of the frames, the partially visible surface vertex semantic segmentation results of the corresponding second parameterized model respectively. That is, for each frame of the two-dimensional image, the computer device may determine, according to the semantic segmentation result of the frame of the two-dimensional image, a partially visible surface vertex semantic segmentation result of the second parameterized model corresponding to the frame of the two-dimensional image.
In one embodiment, the computer device may employ a human parsing network (such as RP-R-CNN, a deep learning network for semantically segmenting images of human bodies) to semantically segment each frame of the two-dimensional image separately.
It can be understood that, because the two-dimensional image can only express the image of the target object in a certain direction, the surface vertex semantic segmentation result of the second parameterized model obtained from the two-dimensional image is partially visible, and the complete surface vertex semantic segmentation result of the parameterized model cannot be obtained.
In one embodiment, the computer device may perform voting fusion according to semantic segmentation results in the second parameterized models corresponding to the surface vertices of the first parameterized model, respectively, to determine the semantic segmentation results of the surface vertices of the first parameterized model. Specifically, for a surface vertex of the first parameterized model, the computer device may use each semantic segmentation result corresponding to the surface vertex in each second parameterized model as a candidate semantic segmentation result of the surface vertex, and then determine a final semantic segmentation result of the surface vertex according to each candidate semantic segmentation result. It can be understood that after the final semantic segmentation result of each surface vertex in the first parameterized model is obtained, the surface vertex semantic segmentation result of the first parameterized model is obtained.
In one embodiment, the computer device may select a final semantic segmentation result of the surface vertex from the candidate semantic segmentation results according to the occurrence frequency of the candidate semantic segmentation results.
In one embodiment, the computer device may determine the candidate semantic segmentation result with the highest frequency of occurrence as the final semantic segmentation result of the surface vertex.
For example: the surface vertex A of the first parameterized model corresponds to the semantic segmentation results "hand", "arm", and "hand" in the second parameterized models B, C, and D, respectively. Because "hand" occurs most frequently, "hand" can be determined as the semantic segmentation result of the surface vertex A.
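The voting fusion described above can be sketched as follows; this is a minimal illustration assuming each view supplies one integer class label per surface vertex of the first parameterized model, with -1 marking vertices that are invisible in that view:

```python
import torch

def fuse_vertex_labels(per_view_labels: torch.Tensor, num_classes: int) -> torch.Tensor:
    """Majority-vote fusion of per-view vertex labels.

    per_view_labels: (num_views, num_vertices) integer tensor; entries of -1
    mark vertices invisible in that view (no candidate result).
    Returns a (num_vertices,) tensor; vertices with no votes stay -1.
    """
    num_views, num_vertices = per_view_labels.shape
    votes = torch.zeros(num_vertices, num_classes, dtype=torch.long)
    visible = per_view_labels >= 0
    for v in range(num_views):
        idx = visible[v].nonzero(as_tuple=True)[0]
        votes[idx, per_view_labels[v, idx]] += 1  # one vote per visible vertex
    fused = votes.argmax(dim=1)
    fused[votes.sum(dim=1) == 0] = -1  # still-invisible vertices keep -1
    return fused
```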
As shown in fig. 5, the computer device performs semantic segmentation on multiple frames of two-dimensional images to obtain semantic segmentation results corresponding to the two-dimensional images, respectively creates second parameterized models according to the two-dimensional images, and performs voting fusion according to the semantic segmentation results corresponding to the two-dimensional images and the corresponding second parameterized models to obtain surface vertex semantic segmentation results of the first parameterized models. The surface vertices may be represented in different colors according to the semantic segmentation result, and for clarity, the regions of different semantic segmentation results in the first parameterized model are labeled with different numbers in fig. 5.
In the above embodiment, the computer device determines the partially visible surface vertex semantic segmentation results of the corresponding second parameterized models according to the semantic segmentation results respectively corresponding to the multiple frames of two-dimensional images, and then determines the surface vertex semantic segmentation result of the first parameterized model from them. It can be understood that the surface vertex semantic segmentation result of each second parameterized model is not complete but only partially visible; the surface vertex semantic segmentation result of the first parameterized model is obtained by fusing these partially visible results, so that the surface vertex semantic segmentation result of the parameterized model is more complete and its accuracy is improved.
In one embodiment, after determining the surface vertex semantic segmentation result of the first parameterized model from the surface vertex semantic segmentation results of each second parameterized model, the method further comprises: determining invisible surface vertices of undetermined semantic segmentation results in the first parameterized model; determining visible surface vertexes within a preset neighborhood range of the invisible surface vertexes; and determining the semantic segmentation result of the invisible surface vertex according to the determined semantic segmentation result of the visible surface vertex.
And the invisible surface vertex is a surface vertex of the undetermined semantic segmentation result in the first parameterized model. Visible surface vertices are the surface vertices of the first parameterized model for which semantic segmentation results have been determined. The preset neighborhood range is a range preset to be adjacent to the surface vertex.
Such as: since it is difficult to photograph the sole of the target object in the multi-frame two-dimensional image of the target object, the surface vertices at the sole of the target object are invisible surface vertices.
Specifically, after determining the surface vertex semantic segmentation result of the first parameterized model according to the surface vertex semantic segmentation result of each second parameterized model, the computer device may determine an invisible surface vertex of the first parameterized model for which the semantic segmentation result is not determined, then determine a visible surface vertex of the determined semantic segmentation result within a preset neighborhood range of the invisible surface vertex, and then determine the semantic segmentation result of the invisible surface vertex according to the determined semantic segmentation result of the visible surface vertex.
In one embodiment, the predetermined neighborhood range may be a region within a predetermined neighborhood radius. The preset neighborhood radius refers to the radius of a preset neighborhood range. Such as: the preset neighborhood radius is 3.
In one embodiment, the computer device may select the semantic segmentation result of the invisible surface vertex from the determined semantic segmentation results of the visible surface vertices according to the occurrence frequency of the determined semantic segmentation results of the visible surface vertices.
In one embodiment, the computer device may select the semantic segmentation result with the highest frequency of occurrence from the determined semantic segmentation results of the visible surface vertices as the semantic segmentation result of the invisible surface vertices.
For example: the visible surface vertexes within the preset neighborhood range of the invisible surface vertex A are B, C and D, and the semantic segmentation results of B, C and D are a hand, a face and a hand respectively. Since the semantic segmentation result of "hand" appears most frequently, the "hand" can be determined as the semantic segmentation result of the invisible surface vertex a.
In the above embodiment, the semantic segmentation result of the invisible surface vertex is determined according to the semantic segmentation result of the visible surface vertex in the preset neighborhood range of the invisible surface vertex, so that the problem that the invisible surface vertex exists in the surface vertex semantic segmentation result of the parameterized model is avoided. The completeness of the surface vertex semantic segmentation result of the parameterized model is improved, and the accuracy of the surface vertex semantic segmentation result of the parameterized model is improved.
In one embodiment, the step 210 of adjusting model parameters of the parameterized model to register the mesh surface in the parameterized model with the three-dimensional mesh surface comprises: obtaining a target loss function of a plurality of constraints; the target loss function comprises a grid difference loss function, a surface topological structure loss function and a deformation loss function; iteratively adjusting model parameters of the parameterized model toward minimizing the objective loss function to register a mesh surface in the parameterized model with the three-dimensional mesh surface.
Wherein the mesh difference loss function is a loss function for characterizing mesh differences between a mesh surface and a three-dimensional mesh surface in the parameterized model. The surface topology loss function is a loss function for maintaining the topology of the surface during deformation of the parameterized model. The deformation loss function is a loss function for restricting the deformation amplitude of the parameterized model. It will be appreciated that the deformation loss function may be used to constrain the magnitude of deformation of the parameterized model to prevent the parameterized model from deforming too much.
In one embodiment, the target loss function used in the depth registration process comprises a mesh difference loss function. The computer device may iteratively adjust global feature parameters of the parameterized model towards minimizing an objective loss function comprising a mesh difference loss function to depth register a mesh surface in the parameterized model with the three-dimensional mesh surface.
In one embodiment, the target loss function used in the registration process comprises a mesh difference loss function. The computer device may iteratively adjust pose parameters and shape parameters of the parameterized model toward minimizing an objective loss function comprising a mesh difference loss function to register mesh surfaces in the parameterized model coincident with the three-dimensional mesh surfaces.
In one embodiment, the target loss functions used in the feature registration process include mesh difference loss functions, surface topology loss functions, and deformation loss functions. The computer device may iteratively adjust offset parameters corresponding to surface vertices of the parameterized model toward minimizing a target loss function comprising a mesh difference loss function, a surface topology loss function, and a deformation loss function to feature registration of a mesh surface in the parameterized model with a three-dimensional mesh surface.
In one embodiment, the mesh difference loss function may include the sum of the minimum distances from the surface vertices of the parameterized model to the three-dimensional mesh surface divided from the three-dimensional continuous surface, and the sum of the minimum distances from the surface vertices of that three-dimensional mesh surface to the mesh surface of the parameterized model. In this embodiment, the minimum distance from each vertex of one of the two models (namely the parameterized model and the three-dimensional mesh surface divided from the three-dimensional continuous surface) to the surface of the other model is optimized bidirectionally, so that the two models coincide, which improves the accuracy of the target parameterized model and thus the accuracy of three-dimensional reconstruction.
In one embodiment, the grid difference loss function is represented by the following equation (1):
$$L_{p2s} = \sum_{p \in S} \min_{\hat{f} \in \hat{M}} d\left(p, \hat{f}\right) + \sum_{\hat{q} \in \hat{S}} \min_{f \in M} d\left(\hat{q}, f\right) \tag{1}$$

where $L_{p2s}$ denotes the mesh difference loss function; $S$ denotes the surface vertices of the parameterized model, and $p \in S$ means that $p$ is a surface vertex of the parameterized model; $M$ denotes the patches of the mesh surface of the parameterized model, and $f \in M$ means that $f$ is such a patch; $\hat{S}$ denotes the surface vertices of the three-dimensional mesh surface divided from the three-dimensional continuous surface, and $\hat{q} \in \hat{S}$ means that $\hat{q}$ is such a surface vertex; $\hat{M}$ denotes the patches of the three-dimensional mesh surface, and $\hat{f} \in \hat{M}$ means that $\hat{f}$ is such a patch. The term $\min_{\hat{f} \in \hat{M}} d(p, \hat{f})$ is the minimum distance from a surface vertex $p$ of the parameterized model to the three-dimensional mesh surface, so the first sum is the sum of the minimum distances from each surface vertex of the parameterized model to the three-dimensional mesh surface. The term $\min_{f \in M} d(\hat{q}, f)$ is the minimum distance from a surface vertex $\hat{q}$ of the three-dimensional mesh surface to the mesh surface of the parameterized model, so the second sum is the sum of the minimum distances from each surface vertex of the three-dimensional mesh surface to the mesh surface of the parameterized model.
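For illustration, a self-contained PyTorch sketch of equation (1) follows. Note the simplifying assumption: the point-to-patch distance is approximated by the point-to-point nearest-neighbor distance between the two vertex sets, whereas an exact implementation would compute true point-to-mesh distances (for example with the Kaolin library mentioned below):

```python
import torch

def mesh_difference_loss(verts_a: torch.Tensor, verts_b: torch.Tensor) -> torch.Tensor:
    """Bidirectional nearest-neighbor (Chamfer-style) approximation of
    equation (1).

    verts_a: (N, 3) surface vertices of the parameterized model.
    verts_b: (M, 3) surface vertices of the meshed continuous surface.
    """
    # Pairwise squared distances, shape (N, M).
    d2 = torch.cdist(verts_a, verts_b) ** 2
    # Sum of minimum distances in both directions.
    return d2.min(dim=1).values.sum() + d2.min(dim=0).values.sum()
```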
In one embodiment, the surface topology loss function may include a surface topology difference loss function and a native surface topology loss function. The surface topology difference loss function is a loss function for characterizing the change of the surface topology of the parameterized model before and after the model parameters are adjusted. The native surface topology loss function is a loss function for constraining the surface topology of the parameterized model after the model parameters are adjusted.
In one embodiment, the surface topology difference loss function may be a regularization of the difference in laplacian values of the mesh surface of the parameterized model before and after adjusting the model parameters. The Laplace value is used for representing local detail characteristics of the grid surface. In the embodiment, the condition that the surface topological structure of the parameterized model is changed too much due to registration processing is avoided by restricting the difference of the Laplace values of the mesh surfaces of the parameterized model before and after adjusting the parameters of the model, and the rationality and the accuracy of the target parameterized model are improved, so that the accuracy of three-dimensional reconstruction is improved.
In one embodiment, the surface topology difference loss function can be represented by the following equation (2):
$$L_{\delta} = \sum_{p \in S} \left\| \delta_p - \delta'_p \right\|_2^2 \tag{2}$$

where $L_{\delta}$ denotes the surface topology difference loss function, $p$ denotes a surface vertex of the parameterized model, $\delta_p$ denotes the Laplace value at surface vertex $p$ after the model parameters are adjusted, and $\delta'_p$ denotes the Laplace value at surface vertex $p$ before the model parameters are adjusted. The Laplace value $\delta_p$ can be expressed by the following equation (3):

$$\delta_p = p - \frac{1}{|N(p)|} \sum_{k \in N(p)} k \tag{3}$$

where $N(p)$ denotes the neighbor surface vertices within the neighborhood of surface vertex $p$, $|N(p)|$ denotes their number, and $k \in N(p)$ means that $k$ is a neighbor surface vertex within the neighborhood of surface vertex $p$. It can be understood that $\delta'_p$ can also be obtained by the calculation method of equation (3).
In one embodiment, the native surface topology loss function may be a regularization of the laplacian values of the mesh surface of the parameterized model after adjusting the model parameters. In the embodiment, by constraining the Laplace value of the grid surface of the parameterized model after the model parameters are adjusted, the situation that the surface topological structure of the target parameterized model changes violently is avoided, the smoothness of the surface of the target parameterized model is enhanced, the rationality and the accuracy of the target parameterized model are improved, and the accuracy of three-dimensional reconstruction is improved.
In one embodiment, the native surface topology loss function can be represented by the following equation (4):
$$L_{lap} = \sum_{p \in S} \left\| \delta_p \right\|_2^2 \tag{4}$$

where $L_{lap}$ denotes the native surface topology loss function and $\delta_p$ denotes the Laplace value at surface vertex $p$ of the mesh surface of the parameterized model after the model parameters are adjusted; $\delta_p$ can likewise be expressed by equation (3).
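A sketch of the Laplace value of equation (3) and of the two surface topology losses of equations (2) and (4) follows, assuming a uniform (unweighted) neighbor average and squared-norm regularization:

```python
import torch

def laplace_values(verts: torch.Tensor, neighbors) -> torch.Tensor:
    """Uniform Laplace value of equation (3): each vertex position minus the
    mean position of its neighbor vertices.

    verts: (N, 3) vertex positions; neighbors[i] lists the vertex ids
    adjacent to vertex i.
    """
    deltas = [verts[i] - verts[nbrs].mean(dim=0)
              for i, nbrs in enumerate(neighbors)]
    return torch.stack(deltas)

def topology_losses(verts_new, verts_old, neighbors):
    """Surface topology difference loss (equation (2)) and native surface
    topology loss (equation (4))."""
    d_new = laplace_values(verts_new, neighbors)
    d_old = laplace_values(verts_old, neighbors)
    loss_diff = ((d_new - d_old) ** 2).sum()  # equation (2)
    loss_lap = (d_new ** 2).sum()             # equation (4)
    return loss_diff, loss_lap
```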
In one embodiment, the deformation loss function may be a regularization of parameter values of offset parameters of surface vertices of the parameterized model. In the embodiment, by constraining the parameter value of the offset parameter of the surface vertex of the parameterized model, the overlarge offset of the surface vertex is avoided, the rationality and the accuracy of the target parameterized model are improved, and the accuracy of three-dimensional reconstruction is improved.
In one embodiment, the deformation loss function may be represented by the following equation (5):
$$L_d = \sum_{p \in S} \left\| d_p \right\|_2^2 \tag{5}$$

where $L_d$ denotes the deformation loss function, $p$ denotes a surface vertex of the parameterized model, and $d_p$ denotes the parameter value of the offset parameter of surface vertex $p$ of the parameterized model after the model parameters are adjusted.
In one embodiment, the target loss function may be represented by the following equation (6):
$$L = \lambda_{p2s} L_{p2s} + \lambda_{\delta} L_{\delta} + \lambda_{lap} L_{lap} + \lambda_d L_d \tag{6}$$

where $L_{p2s}$ denotes the mesh difference loss function, $L_{\delta}$ denotes the surface topology difference loss function, $L_{lap}$ denotes the native surface topology loss function, and $L_d$ denotes the deformation loss function; $\lambda_{p2s}$, $\lambda_{\delta}$, $\lambda_{lap}$, and $\lambda_d$ are the weights of the respective loss functions in the target loss function. The respective weights may be set according to actual conditions.

In one embodiment, the respective weights may be set to $\lambda_{p2s} = 1 \times 10^{2}$, $\lambda_{lap} = 1 \times 10^{3}$, and $\lambda_d = 1 \times 10^{1}$.
In one embodiment, the computer device may iteratively adjust the model parameters of the parameterized model toward minimizing the target loss function through an Adam optimizer (an algorithm for iteratively optimizing model parameters) to register the mesh surface in the parameterized model with the three-dimensional mesh surface. In one embodiment, the learning rate of the process of iteratively adjusting the model parameters of the parameterized model may be set to $lr = 1 \times 10^{-2}$.
In one embodiment, the computer device may implement the terms of the target loss function through the Kaolin package (an open-source tool library for deep learning on three-dimensional data).
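Putting these pieces together, a sketch of the feature registration loop with the Adam optimizer at $lr = 1 \times 10^{-2}$ might look as follows. It reuses the loss sketches above; the weight of the surface topology difference term used here is an assumption, since its value does not survive in the text:

```python
import torch

def feature_registration(verts_init, target_verts, neighbors, steps=200):
    """Optimize the per-vertex offsets toward minimizing the target loss
    function of equation (6) with Adam.

    verts_init: (N, 3) mesh vertices after depth and coincidence registration.
    target_verts: (M, 3) vertices of the meshed three-dimensional surface.
    """
    lam_p2s, lam_lap, lam_d = 1e2, 1e3, 1e1
    lam_delta = 1e2  # weight of the topology difference term: an assumption
    offsets = torch.zeros_like(verts_init, requires_grad=True)
    optimizer = torch.optim.Adam([offsets], lr=1e-2)
    for _ in range(steps):
        optimizer.zero_grad()
        verts = verts_init + offsets
        l_p2s = mesh_difference_loss(verts, target_verts)       # equation (1)
        l_delta, l_lap = topology_losses(verts, verts_init, neighbors)
        l_d = (offsets ** 2).sum()                              # equation (5)
        loss = (lam_p2s * l_p2s + lam_delta * l_delta
                + lam_lap * l_lap + lam_d * l_d)                # equation (6)
        loss.backward()
        optimizer.step()
    return (verts_init + offsets).detach()
```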
In the above embodiment, iteratively adjusting the model parameters of the parameterized model toward minimizing the multi-constraint target loss function to perform the registration processing on the mesh surface in the parameterized model and the three-dimensional mesh surface improves both the accuracy and the efficiency of three-dimensional reconstruction. Specifically, constructing the mesh difference loss function makes the two models coincide, improving the accuracy of the target parameterized model and thus of the three-dimensional reconstruction. Constructing the surface topology loss function improves the rationality and accuracy of the surface topology of the target parameterized model, improving the accuracy of three-dimensional reconstruction. Constructing the deformation loss function constrains the parameter values of the offset parameters of the surface vertices of the parameterized model, avoiding excessive vertex offsets, improving the rationality and accuracy of the target parameterized model, and thus improving the accuracy of three-dimensional reconstruction.
In one embodiment, the two-dimensional image is a single frame two-dimensional image, and the parameterized model created from the single frame two-dimensional image is the first parameterized model. In this embodiment, the method further includes: acquiring a multi-frame two-dimensional image comprising a target object; acquiring a second parameterized model of the target object correspondingly created according to each two-dimensional image in the multiple two-dimensional images; mapping points corresponding to the target object in the corresponding two-dimensional image to a texture space according to texture coordinates corresponding to the surface vertex of each second parameterized model to obtain initial texture maps of the target object corresponding to each two-dimensional image frame; fusing the initial texture maps to obtain a texture map of the target object; and performing texture rendering on the target parameterized model according to the texture map.
The texture coordinates (UV coordinates) are the corresponding coordinates of the surface vertices of the parameterized model on the texture map. The texture space is a coordinate space based on the texture map. The initial texture map is a texture map obtained from a single two-dimensional image of a plurality of two-dimensional images. It will be appreciated that the initial texture map is not a complete texture map of the parametric model, since it is derived from a two-dimensional image of a single frame.
In one embodiment, the single two-dimensional image may or may not be included in the multi-frame two-dimensional image.
In one embodiment, a multi-frame two-dimensional image including a target object may be acquired by the method in the foregoing embodiments of acquiring a multi-frame two-dimensional image.
In one embodiment, the computer device may create the second parameterized model of the target object according to each two-dimensional image of the plurality of frames of two-dimensional images respectively by the method in the foregoing embodiments for creating the three-dimensional parameterized model of the target object according to the two-dimensional images. And respectively creating a second parameterized model for each two-dimensional image.
In an embodiment, the computer device may map, according to the texture coordinates corresponding to the surface vertices of the second parameterized models and the correspondence between the surface vertices of the second parameterized models and the points in the corresponding two-dimensional images, the points corresponding to the target object in the corresponding two-dimensional images to a texture space, to obtain texture coordinates of the points corresponding to the target object in the corresponding two-dimensional images, thereby forming initial texture maps of the target object corresponding to each frame of the two-dimensional images.
As shown in fig. 6, for each two-dimensional image, according to the two-dimensional image and the texture coordinates (e.g. 602 in fig. 6) corresponding to the surface vertices of the corresponding second parameterized model, the points corresponding to the target object in the two-dimensional image are mapped to the texture space (i.e. texture expansion), and the texture coordinates of the points corresponding to the target object in the corresponding two-dimensional image are obtained, so as to obtain the initial texture map of the target object corresponding to the two-dimensional image, as shown by 604 in fig. 6.
In one embodiment, the computer device may sort the initial texture maps, and then sequentially fuse the initial texture maps according to the sorted order to obtain the texture map of the target object.
In one embodiment, the computer device may order the initial texture maps according to a global orientation of the second parameterized models. The global direction refers to a parameter value of a global direction parameter.
In one embodiment, the computer device may perform texture rendering on the target parameterized model according to the texture map of the target object, to obtain a texture-rendered target parameterized model.
In one embodiment, the computer device may perform texture rendering on the target parameterized model according to the texture map of the target object through a Neural renderer (an algorithm for performing texture rendering), so as to obtain a texture-rendered target parameterized model.
In one embodiment, the computer device may first segment the foreground (i.e., the target object) from each frame of the two-dimensional image, and then obtain the initial texture map according to the two-dimensional image obtained by the segmentation. In one embodiment, the computer device may use a foreground segmentation model to segment the foreground (i.e., the target object) from the plurality of frames of two-dimensional images, respectively. In the embodiment, the background in the two-dimensional image is removed by segmenting the foreground, so that the problem of background information in the obtained initial texture mapping is solved, and the accuracy of obtaining the texture mapping is improved.
In one embodiment, the computer device may first perform a de-illumination process on each frame of the two-dimensional image to remove illumination in the two-dimensional image, thereby improving the accuracy and clarity of texture mapping. In one embodiment, the computer device may use a de-illumination model to de-illuminate each frame of the two-dimensional image separately.
In the above embodiment, according to multiple frames of two-dimensional images and corresponding second parameterized models, initial texture maps corresponding to the frames of two-dimensional images are obtained, and then the initial texture maps are fused to obtain an accurate texture map of the target object, and then the texture rendering is performed on the target parameterized model according to the texture maps to obtain the target parameterized model after the texture rendering, so that the information content contained in the target parameterized model is richer, and the applicability of the reconstructed target parameterized model is improved.
In one embodiment, the fusing the initial texture maps to obtain the texture map of the target object includes: determining a fusion sequence corresponding to each frame of two-dimensional image according to the root node direction of each second parameterized model; and fusing the initial texture maps corresponding to the two-dimensional images of each frame according to the fusion sequence to obtain the texture map of the target object.
And the root node direction is used for representing the global direction of the parameterized model. And the root node is a node on a key point skeleton of the parameterized model. The rotation and translation of the root node represent the rotation and translation of the entire parameterized model.
In an embodiment, the computer device may sequentially sort the plurality of frames of two-dimensional images according to the root node direction of each second parameterized model, so as to obtain a fusion sequence corresponding to each frame of two-dimensional image.
In one embodiment, the computer device may group the two-dimensional images of each frame according to the root node direction of the corresponding second parameterized model according to preset main directions, that is, divide the two-dimensional images into groups of main directions closest to the root node direction. And then in each group, sequencing the two-dimensional images of the group according to the consistency between the root node direction and the main direction of the group to obtain a fusion sequence corresponding to the two-dimensional images of each frame. And the consistency is used for representing the proximity degree of the root node direction and the main direction.
In one embodiment, the main directions may include 4 directions: front, rear, left, and right.
In an embodiment, the computer device may sequentially fuse the initial texture maps corresponding to the two-dimensional images according to the fusion sequence corresponding to each two-dimensional image to obtain the texture map of the target object.
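For illustration, the grouping and sorting could be sketched as follows; the main-direction unit vectors and the coordinate convention are assumptions:

```python
import torch

MAIN_DIRECTIONS = {  # assumed unit vectors for the 4 main directions
    "front": torch.tensor([0.0, 0.0, 1.0]),
    "rear": torch.tensor([0.0, 0.0, -1.0]),
    "left": torch.tensor([1.0, 0.0, 0.0]),
    "right": torch.tensor([-1.0, 0.0, 0.0]),
}

def fusion_order(root_directions):
    """Group frames by the main direction closest to each frame's root node
    direction, then sort each group by that closeness (cosine similarity).

    root_directions: list of (3,) unit vectors, one per two-dimensional image.
    Returns the frame indices in fusion order.
    """
    groups = {name: [] for name in MAIN_DIRECTIONS}
    for i, d in enumerate(root_directions):
        sims = {name: float(torch.dot(d, m)) for name, m in MAIN_DIRECTIONS.items()}
        best = max(sims, key=sims.get)           # closest main direction
        groups[best].append((sims[best], i))
    order = []
    for name in MAIN_DIRECTIONS:                 # dicts preserve insertion order
        order.extend(i for _, i in sorted(groups[name], reverse=True))
    return order
```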
In one embodiment, the computer device may first determine the visibility of each point on the initial texture map corresponding to each two-dimensional image. Then, starting from the first two-dimensional image in the fusion sequence, the computer device takes a current two-dimensional image and generates a texture map corresponding to it according to its initial texture map and the visibility of each point; fuses the texture map corresponding to the current two-dimensional image with the accumulatively fused texture map; and, after fusion, takes the next two-dimensional image in the fusion sequence as the current two-dimensional image and iterates the generating and fusing steps until the iteration stops, obtaining the texture map of the target object.
In one embodiment, the computer device may weight each point on the initial texture map corresponding to the current two-dimensional image with the visibility of each point on the initial texture map to generate a texture map corresponding to the current two-dimensional image.
In one embodiment, for each frame of the two-dimensional image, the computer device may determine the visibility of each point on the initial texture map corresponding to the two-dimensional image according to the proximity between the normal vector direction of each surface vertex of the corresponding second parameterized model and the shooting direction of the two-dimensional image.
In the above embodiment, the fusion sequence corresponding to each frame of the two-dimensional image is determined according to the root node direction of each second parameterized model, and then the initial texture maps corresponding to each frame of the two-dimensional image are fused according to the fusion sequence to obtain the texture map of the target object, so that the efficiency of obtaining the texture map by fusion is improved, and the accuracy of the obtained texture map is improved.
In one embodiment, the method further comprises: and acquiring the visibility maps of the target objects corresponding to the two-dimensional images of the frames respectively. In this embodiment, fusing the initial texture maps corresponding to the two-dimensional images of each frame according to the fusion sequence to obtain the texture map of the target object includes: according to the fusion sequence, selecting a current two-dimensional image from a first two-dimensional image, and generating a texture map corresponding to the current two-dimensional image according to an initial texture map and a visibility map corresponding to the current two-dimensional image; fusing the texture map corresponding to the current two-dimensional image with the accumulated fused texture map; and after fusion, taking the next two-dimensional image as the current two-dimensional image according to the fusion sequence, iterating, returning to the step of generating the texture map corresponding to the current two-dimensional image according to the initial texture map and the visibility map corresponding to the current two-dimensional image, and continuing to execute the step until iteration is stopped to obtain the texture map of the target object.
And the visibility map is used for representing the visibility of each point on the initial texture map corresponding to the two-dimensional image. The texture map which is subjected to accumulative fusion is the texture map which is obtained by accumulative fusion before the initial texture map corresponding to the current two-dimensional image is fused.
Specifically, the computer device may use the first two-dimensional image as the current two-dimensional image, generate a texture map corresponding to the current two-dimensional image according to an initial texture map and a visibility map corresponding to the current two-dimensional image, then fuse the texture map corresponding to the current two-dimensional image with the accumulated fused texture maps, and after the fusion, use the next two-dimensional image as the current two-dimensional image according to the fusion sequence, iterate to return to the step of generating the texture map corresponding to the current two-dimensional image according to the initial texture map and the visibility map corresponding to the current two-dimensional image to continue the execution until the iteration is stopped, that is, the initial texture maps corresponding to the two-dimensional images of each frame are fused to complete, so as to obtain the texture map of the target object.
As shown in fig. 7, 702 is an initial texture map corresponding to each two-dimensional image of each frame, and 704 is a visibility map corresponding to each two-dimensional image of each frame, and after iterative texture fusion, a texture map 706 of the target object is obtained.
In one embodiment, the computer device may multiply the initial texture map and the visibility map corresponding to the current two-dimensional image to generate a texture map corresponding to the current two-dimensional image. It will be appreciated that multiplying the initial texture map and the visibility map corresponding to the current two-dimensional image corresponds to weighting the texture values of the points on the initial texture map with the visibility of the points on the visibility map.
In one embodiment, fusing the texture map corresponding to the current two-dimensional image with the cumulatively fused texture map may include: and multiplying the accumulated and fused texture map with the invisibility map corresponding to the current two-dimensional image, and adding the texture map corresponding to the current two-dimensional image.
In one embodiment, the iteration step of fusing the initial texture maps corresponding to the frames of two-dimensional images according to the fusion sequence to obtain the texture map of the target object may be represented by the following equation (7):

$$\Gamma' = (1 - \alpha) \times \Gamma + \alpha \times P \tag{7}$$

where $\alpha$ denotes the visibility map corresponding to the current two-dimensional image, $P$ denotes the initial texture map corresponding to the current two-dimensional image, $\Gamma$ denotes the accumulatively fused texture map, and $\Gamma'$ denotes the accumulatively fused texture map obtained after fusing the initial texture map corresponding to the current two-dimensional image.
In one embodiment, for each frame of the two-dimensional image, the computer device may determine the visibility of each point on the initial texture map corresponding to the two-dimensional image according to the proximity between the normal vector direction of each surface vertex of the corresponding second parameterized model and the shooting direction of the two-dimensional image, and generate the visibility map corresponding to the two-dimensional image.
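A compact sketch of this iterative fusion, applying equation (7) directly, follows; the texture and visibility maps are assumed to be tensors in texture space:

```python
import torch

def fuse_textures(initial_maps, visibility_maps, order):
    """Iterative texture fusion of equation (7): frames later in the fusion
    order overwrite earlier ones wherever they are visible.

    initial_maps: list of (H, W, 3) initial texture maps.
    visibility_maps: list of (H, W, 1) visibility maps with values in [0, 1].
    order: frame indices in fusion order.
    """
    fused = torch.zeros_like(initial_maps[0])  # accumulated texture map
    for i in order:
        alpha, tex = visibility_maps[i], initial_maps[i]
        fused = (1 - alpha) * fused + alpha * tex  # equation (7)
    return fused
```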
In the above embodiment, the initial texture maps corresponding to the two-dimensional images of the frames are iteratively and sequentially fused according to the fusion sequence, and the initial texture maps are weighted by the visibility map in the fusion process to obtain the texture map of the target object, so that the accuracy of the obtained texture map is improved.
In one embodiment, acquiring the visibility map corresponding to each two-dimensional image of each frame includes: generating a normal vector map corresponding to the corresponding two-dimensional image according to the normal vector of each second parameterized model surface; and aiming at each frame of two-dimensional image, determining the visibility of each point of the target object in the normal vector map according to the proximity degree between the normal vector direction in the normal vector map corresponding to the two-dimensional image and the shooting direction of the two-dimensional image, and obtaining the visibility map corresponding to the two-dimensional image.
And the normal vector map is used for representing the normal vector directions of surface vertexes of all points on the target object in the two-dimensional image, which correspond to the points in the corresponding second parameterized model.
In one embodiment, for each two-dimensional image, the computer device may generate a normal vector map corresponding to the two-dimensional image according to the normal vectors of the surface vertices of the corresponding second parameterized model, as shown at 606 in fig. 6 as the normal vector map.
In one embodiment, for each frame of two-dimensional image, the computer device may determine the visibility of each point in the initial texture map corresponding to the two-dimensional image according to the proximity between the normal vector direction of each point in the normal vector map corresponding to the two-dimensional image and the shooting direction of the two-dimensional image, so as to obtain the visibility map corresponding to the two-dimensional image. Specifically, the computer device may obtain the visibility corresponding to each point according to the proximity between the normal vector direction of each point in the normal vector map and the shooting direction of the two-dimensional image, to obtain a visibility map, where 608 in fig. 6 is the visibility map.
In one embodiment, the computer device may represent the degree of proximity in accordance with a cosine value between the normal vector direction and the capturing direction of the two-dimensional image.
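As a sketch, the visibility of each point can be taken as the clamped cosine between its normal vector and the shooting direction; the convention that the direction points from the surface toward the camera is an assumption:

```python
import torch

def visibility_from_normals(normal_map: torch.Tensor,
                            view_dir: torch.Tensor) -> torch.Tensor:
    """Per-point visibility from the proximity (cosine) between the normal
    vector map and the shooting direction of the two-dimensional image.

    normal_map: (H, W, 3) unit normal vectors; view_dir: (3,) unit vector
    pointing from the surface toward the camera (assumed convention).
    """
    cos = (normal_map * view_dir).sum(dim=-1, keepdim=True)  # (H, W, 1)
    return cos.clamp(min=0.0)  # back-facing points become invisible
```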
In the above embodiment, the visibility map corresponding to the two-dimensional image is obtained according to the proximity between the normal vector direction in the normal vector map corresponding to the two-dimensional image and the shooting direction of the two-dimensional image, so that an accurate visibility map can be obtained, and the accuracy of the obtained texture map can be improved.
As shown in fig. 8, for the target parameterized models obtained by three-dimensionally reconstructing 3 target human bodies through the target object three-dimensional reconstruction method in the embodiments of the present application, the reconstruction results are accurate: the target parameterized models carry detailed information such as clothes, hair, and shoes at every angle, and their shape expression is complete and clear, with no incomplete or fuzzy parts.
In one embodiment, the target parameterized model is a drivable parameterized model. In this embodiment, the method further includes: acquiring action parameters; and substituting the action parameters into the target parameterized model to drive the target parameterized model to execute corresponding actions.
The drivable parameterized model refers to a parameterized model that can input motion parameters to drive the execution motion.
In one embodiment, the drivable parameterized models may include at least an SMPL model, an SMPLH model, an SMPLX model, and the like.
In one embodiment, the computer device may extract motion parameters from a motion video or motion image.
In one embodiment, the motion parameters may be captured by a motion capture device, and then the computer device may obtain the motion parameters captured by the motion capture device.
In another embodiment, the computer device may extract motion parameters of the parameterized model from two-dimensional images through a motion capture algorithm and then substitute the motion parameters into the target parameterized model to drive the target parameterized model to perform corresponding motions. In one embodiment, the motion capture algorithm may be the VIBE algorithm (a motion capture algorithm that obtains motion parameters of the SMPL model from an input video segment).
In one embodiment, if the target parameterized model is obtained based on SMPLX model deformation, the computer device may extract motion parameters of the SMPL model from the two-dimensional image by VIBE algorithm, and then convert the motion parameters of the SMPL model into motion parameters of the SMPLX model.
In one embodiment, the computer device may obtain a sequence of motion parameters, and substitute the sequence of motion parameters into the target parameterized model to obtain a plurality of frames of images of the target parameterized model performing the motion, thereby forming a continuous video in which the target parameterized model performs a series of motions.
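As an illustrative sketch only (not the method of this application), driving a drivable parameterized model with a sequence of action parameters could look as follows, assuming the open-source smplx Python package and SMPL-X axis-angle body poses:

```python
import torch
import smplx  # open-source SMPL-X package; availability is an assumption

def drive_model(model_path: str, pose_sequence: torch.Tensor):
    """Substitute each frame of action parameters into the parameterized
    model to obtain the posed vertices for that frame.

    pose_sequence: (T, 63) axis-angle body pose parameters (21 body joints).
    """
    body_model = smplx.create(model_path, model_type="smplx")
    frames = []
    for t in range(pose_sequence.shape[0]):
        output = body_model(body_pose=pose_sequence[t:t + 1], return_verts=True)
        frames.append(output.vertices.detach()[0])  # (V, 3) posed mesh
    return frames
```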
In one embodiment, the computer device may perform texture rendering on the image of the target parameterized model performing the action according to the texture map, resulting in an image of the target parameterized model performing the action with the texture map. As shown in fig. 9, the target parameterized model with texture maps is driven to perform an action according to the sequence 902 of action parameters in the figure, resulting in a plurality of frames of images 904 of the target parameterized model performing the action.
In the embodiment, the obtained target parameterized model can execute corresponding actions according to the input action parameters to realize action driving, so that the problem that the three-dimensional continuous surface has detail information but cannot realize action driving is solved, the obtained target parameterized model has the detail information and can realize action driving, the applicability of the target parameterized model obtained by three-dimensional reconstruction is improved, and the application range is widened.
Fig. 10 is a schematic overall flow chart of a target object three-dimensional reconstruction method in the embodiments of the present application. The computer device can acquire a single-frame two-dimensional image of the target object, then create a first parameterized model according to the single-frame two-dimensional image, generate a three-dimensional continuous surface, and divide a three-dimensional grid surface from the three-dimensional continuous surface, and adjust model parameters of the first parameterized model so as to perform registration processing on the grid surface of the first parameterized model and the three-dimensional grid surface of the three-dimensional continuous surface, thereby obtaining the target parameterized model. In the registration process, the computer device may fix the parameter values of the partial offset parameters according to the surface vertex semantic segmentation result of the first parameterized model. The steps of determining the result of the semantic segmentation of the surface vertices of the first parameterized model are as follows: the computer equipment can acquire a plurality of frames of two-dimensional images of the target object, respectively create a second parameterized model of each frame of two-dimensional image, and then obtain a surface vertex semantic segmentation result of the first parameterized model according to the semantic segmentation result of each frame of two-dimensional image and the corresponding second parameterized model. After the target parameterized model is obtained, the computer equipment can perform action driving on the target parameterized model through the action parameters and perform texture rendering on the target parameterized model according to the texture maps. The texture mapping is determined as follows: the computer equipment can obtain multiple frames of two-dimensional images of the target object, respectively create second parameterized models of the two-dimensional images, then generate initial texture maps and visibility maps corresponding to the two-dimensional images according to the two-dimensional images and the corresponding second parameterized models, and then iteratively fuse the initial texture maps after weighting the visibility maps according to a fusion sequence to obtain the texture maps.
The application scene is used for carrying out three-dimensional reconstruction on a target human body, and the target object three-dimensional reconstruction method is applied to the application scene. Specifically, the application of the target object three-dimensional reconstruction method in the application scene is as follows:
the method comprises the steps that a video of a target human body rotating for one circle is shot through a camera, then a front two-dimensional image is selected from a plurality of frames of two-dimensional images of the video, a computer device can create a three-dimensional parameterized model of the target human body according to the front two-dimensional image of the target human body, a three-dimensional continuous surface of the target human body is generated according to the two-dimensional image, grid division is carried out on the three-dimensional continuous surface to obtain a three-dimensional grid surface, then the computer device can adjust model parameters of the parameterized model to carry out registration processing on the grid surface and the three-dimensional grid surface in the parameterized model to obtain a final target parameterized model of the target human body, and three-dimensional reconstruction of the target human body is achieved.
Further, the computer device may further obtain a texture map of the target human body according to the multiple frames of two-dimensional images of the video and the parameterized model of the target human body correspondingly created according to each frame of two-dimensional image, and perform texture rendering on the target parameterized model according to the texture map to obtain a three-dimensional reconstruction result of the target human body with the texture map, such as: a three-dimensional reconstruction result of the target human body with a pattern of clothes or the like worn by the target human body.
Further, the computer device may further extract motion parameters from the motion video for motion reference, substitute the motion parameters into the target parameterized model to drive the target parameterized model to execute corresponding motions, and perform texture rendering on the target parameterized model executing the motions according to the texture maps to obtain an image of a three-dimensional reconstruction result execution motion of the target human body to be texture mapped, or a continuous segment of the motion execution video composed of multiple frames of images executing the motions. For example, the motion parameters can be extracted from the dance video, and finally the video in which the three-dimensional reconstruction result of the target human body dances according to the motion in the dance video is obtained.
The application scene is a scene for performing three-dimensional reconstruction on the target animal, and the application scene applies the target object three-dimensional reconstruction method. Specifically, the application of the target object three-dimensional reconstruction method in the application scene is as follows:
the method comprises the steps that a multi-frame two-dimensional image of a target animal is acquired by an image acquisition device through rotating around the target animal, then a front two-dimensional image is selected from the multi-frame two-dimensional image, a three-dimensional parameterized model of the target animal can be created by a computer device according to the front two-dimensional image of the target animal, a three-dimensional continuous surface of the target animal is generated according to the two-dimensional image, grid division is carried out on the three-dimensional continuous surface to obtain a three-dimensional grid surface, then model parameters of the parameterized model can be adjusted by the computer device to carry out registration processing on the grid surface and the three-dimensional grid surface in the parameterized model to obtain a final target parameterized model of the target animal, and three-dimensional reconstruction of the target animal is achieved.
Further, the computer device may further obtain a texture map of the target animal according to the multiple frames of two-dimensional images of the video and the parameterized model of the target animal correspondingly created according to each frame of two-dimensional image, and perform texture rendering on the parameterized model of the target according to the texture map to obtain a three-dimensional reconstruction result of the target animal with the texture map, such as: and (3) a three-dimensional reconstruction result of the target animal with a pattern such as a hair color line of the target animal.
Further, the computer device can also extract motion parameters from the motion video for motion reference, substitute the motion parameters into the target parameterized model to drive the target parameterized model to execute corresponding motions, and perform texture rendering on the target parameterized model executing the motions according to the texture maps to obtain an image of a three-dimensional reconstruction result execution motion of the target animal to be texture mapped, or a continuous segment of the motion execution video consisting of multiple frames of images executing the motions. The action video can be an animal action video or a human action video, for example, action parameters can be extracted from the human action video, and finally a three-dimensional reconstruction result of the target animal is obtained to execute the action video according to the human action video, so that an animal anthropomorphic effect is realized.
It should be understood that, although the steps in the flowcharts are shown in sequence as indicated by the arrows, the steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated otherwise, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least a part of the steps in each flowchart may include a plurality of sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times; these sub-steps or stages are not necessarily performed sequentially, but may be performed in turn or alternately with other steps or with at least a part of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 11, there is provided an apparatus 1100 for three-dimensional reconstruction of a target object, which may be a part of a computer device using software modules or hardware modules, or a combination of the two modules, and specifically includes: an image acquisition module 1102, a parameterized model creation module 1104, a continuous surface generation module 1106, a meshing module 1108, and a model parameter adjustment module 1110, wherein:
an image acquisition module 1102 for acquiring a two-dimensional image including a target object.
A parameterized model creation module 1104 for creating a three-dimensional parameterized model of the target object from the two-dimensional image.
A continuous surface generation module 1106 for generating a three-dimensional continuous surface of the target object from the two-dimensional image; the three-dimensional continuous surface is a three-dimensional surface obtained by continuously representing the surface of the target object.
And a mesh division module 1108, configured to perform mesh division on the three-dimensional continuous surface to obtain a three-dimensional mesh surface.
The model parameter adjustment module 1110 is configured to adjust model parameters of the parameterized model, so as to register the mesh surface of the parameterized model with the three-dimensional mesh surface and obtain the final target parameterized model of the target object.
In one embodiment, the model parameter adjustment module 1110 is further configured to add offset parameters to the parameterized model to obtain a deformable parameterized model, and to adjust the offset parameters corresponding to the surface vertices of the deformable parameterized model so as to register the mesh surface of the parameterized model with the three-dimensional mesh surface.
In one embodiment, the model parameters of the deformable parameterized model further include global feature parameters, pose parameters, and shape parameters. In this embodiment, the model parameter adjustment module 1110 is further configured to: adjust the global feature parameters of the deformable parameterized model to depth-register the mesh surface of the parameterized model with the three-dimensional mesh surface; adjust the pose parameters and shape parameters of the depth-registered parameterized model to register the mesh surface of the parameterized model with the three-dimensional mesh surface; and adjust the offset parameters corresponding to the surface vertices of the parameterized model after that registration, so as to register the characteristic parts of the mesh surface of the parameterized model with those of the three-dimensional mesh surface.
In one embodiment, the model parameter adjustment module 1110 is further configured to obtain a surface vertex semantic segmentation result of the parameterized model; determine, according to the surface vertex semantic segmentation result, the surface vertices of the deformable parameterized model where the geometric shape changes sharply; and fix the offset parameters corresponding to the determined surface vertices while adjusting the offset parameters corresponding to the remaining surface vertices of the deformable parameterized model, so as to register the mesh surface of the parameterized model with the three-dimensional mesh surface.
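As a rough illustration of this staged adjustment, the following sketch optimizes a toy vertex-based model in three stages: global parameters first, then a simplified pose/shape term, then per-vertex offsets with the sharp-geometry vertices held fixed. The tensor shapes, the learning rate, the chamfer objective, and the toy model itself are assumptions for illustration, not the actual formulation of this disclosure.

```python
import torch

def chamfer(a, b):
    # symmetric nearest-neighbour distance between two point sets
    d = torch.cdist(a, b)                      # (Na, Nb) pairwise distances
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

target = torch.rand(500, 3)                    # stand-in for the 3D mesh surface
base = torch.rand(300, 3)                      # template surface vertices

scale = torch.ones(1, requires_grad=True)      # global feature parameters
trans = torch.zeros(3, requires_grad=True)
shape = torch.zeros(300, 3, requires_grad=True)   # pose/shape terms, simplified
offset = torch.zeros(300, 3, requires_grad=True)  # per-vertex offsets

fixed = torch.zeros(300, 1)                    # 1 = vertex with sharply changing
fixed[:30] = 1.0                               # geometry, frozen in the offset stage

def surface():
    return (base + shape) * scale + trans + offset * (1.0 - fixed)

# Stage 1: depth registration via global parameters only.
# Stage 2: pose/shape registration. Stage 3: per-vertex offsets.
for params in ([scale, trans], [shape], [offset]):
    opt = torch.optim.Adam(params, lr=1e-2)
    for _ in range(200):
        opt.zero_grad()
        loss = chamfer(surface(), target)
        loss.backward()
        opt.step()
```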
In one embodiment, the two-dimensional image is a single frame two-dimensional image, and the parameterized model created from the single frame two-dimensional image is the first parameterized model. In this embodiment, the apparatus 1100 for three-dimensional reconstruction of a target object further includes:
a semantic segmentation module 1112, configured to acquire multiple frames of two-dimensional images including the target object; acquire a second parameterized model of the target object correspondingly created from each two-dimensional image in the multiple frames; perform semantic segmentation on each frame of two-dimensional image to obtain a semantic segmentation result for the partially visible surface vertices of the corresponding second parameterized model; and determine the surface vertex semantic segmentation result of the first parameterized model from the partially visible surface vertex semantic segmentation results of the second parameterized models.
In one embodiment, the semantic segmentation module 1112 is further configured to determine the invisible surface vertices in the first parameterized model for which no semantic segmentation result has been determined; determine the visible surface vertices within a preset neighborhood range of each invisible surface vertex; and determine the semantic segmentation result of the invisible surface vertex from the semantic segmentation results of those visible surface vertices.
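A minimal sketch of this neighborhood fill-in follows, assuming vertex positions, boolean visibility flags, integer part labels, and a neighborhood radius are given; the majority vote is an assumed rule, since the disclosure only states that the result is determined from the visible neighbors.

```python
import numpy as np

def propagate_labels(verts, labels, visible, radius=0.05):
    """Give each invisible vertex the majority label of the visible
    vertices within `radius` of it; leave it unchanged if none are near."""
    out = labels.copy()
    for i in np.where(~visible)[0]:
        d = np.linalg.norm(verts - verts[i], axis=1)   # distance to all vertices
        nearby = visible & (d < radius)                # visible ones in range
        if nearby.any():
            vals, counts = np.unique(out[nearby], return_counts=True)
            out[i] = vals[counts.argmax()]             # majority vote
    return out
```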
In one embodiment, the model parameter adjustment module 1110 is further configured to obtain a target loss function with multiple constraint terms, the target loss function including a mesh difference loss function, a surface topology loss function, and a deformation loss function; and to iteratively adjust the model parameters of the parameterized model toward minimizing the target loss function, so as to register the mesh surface of the parameterized model with the three-dimensional mesh surface.
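One plausible way to assemble such a multi-term objective is sketched below: a chamfer-style mesh difference term, a surface topology term that penalizes edges for departing from their template lengths, and a deformation term that keeps the per-vertex offsets small. The exact form of each term and the weights are assumptions; the disclosure names the three loss components but not their formulas.

```python
import torch

def objective(pred_verts, rest_verts, target_verts, edges, offsets,
              w_mesh=1.0, w_topo=0.1, w_deform=0.01):
    # mesh difference: symmetric nearest-neighbour (chamfer-style) distance
    d = torch.cdist(pred_verts, target_verts)
    mesh = d.min(dim=1).values.mean() + d.min(dim=0).values.mean()
    # surface topology: edges should keep their template (rest-pose) lengths
    e_pred = (pred_verts[edges[:, 0]] - pred_verts[edges[:, 1]]).norm(dim=1)
    e_rest = (rest_verts[edges[:, 0]] - rest_verts[edges[:, 1]]).norm(dim=1)
    topo = (e_pred - e_rest).abs().mean()
    # deformation: discourage large per-vertex offsets
    deform = offsets.norm(dim=1).mean()
    return w_mesh * mesh + w_topo * topo + w_deform * deform
```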
In one embodiment, the two-dimensional image is a single frame two-dimensional image, and the parameterized model created from the single frame two-dimensional image is the first parameterized model. In this embodiment, the apparatus 1100 for three-dimensional reconstruction of a target object further includes:
a texture rendering module 1114, configured to obtain a multi-frame two-dimensional image including a target object; acquiring a second parameterized model of the target object correspondingly created according to each two-dimensional image in the multiple two-dimensional images; mapping points corresponding to the target object in the corresponding two-dimensional image to a texture space according to texture coordinates corresponding to the surface vertex of each second parameterized model to obtain initial texture maps of the target object corresponding to each two-dimensional image frame; fusing the initial texture maps to obtain a texture map of the target object; and performing texture rendering on the target parameterized model according to the texture map.
In one embodiment, the texture rendering module 1114 is further configured to determine a corresponding fusion order of the two-dimensional images according to a root node direction of each second parameterized model; and fusing the initial texture maps corresponding to the two-dimensional images of each frame according to the fusion sequence to obtain the texture map of the target object.
In one embodiment, the texture rendering module 1114 is further configured to acquire visibility maps of the target object corresponding to each frame of two-dimensional image. In this embodiment, fusing the initial texture maps corresponding to each frame of two-dimensional image in the fusion order to obtain the texture map of the target object includes: selecting the current two-dimensional image starting from the first two-dimensional image in the fusion order, and generating a texture map corresponding to the current two-dimensional image from the initial texture map and the visibility map corresponding to the current two-dimensional image; fusing the texture map corresponding to the current two-dimensional image with the accumulated fused texture map; and, after each fusion, taking the next two-dimensional image in the fusion order as the current two-dimensional image and returning to the step of generating the texture map corresponding to the current two-dimensional image, iterating until the iteration stops, so as to obtain the texture map of the target object.
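The sketch below illustrates one such sequential fusion, assuming the initial texture maps and visibility maps are already arranged in the fusion order; taking each texel from whichever frame has seen it best so far is an assumed blending rule, one simple instance of the accumulate-and-fuse loop described above.

```python
import numpy as np

def fuse_textures(initial_maps, visibility_maps):
    """Sequentially fuse per-frame texture maps (H, W, 3) using their
    visibility maps (H, W); a later frame overwrites only the texels it
    sees better than anything accumulated so far."""
    fused = np.zeros_like(initial_maps[0])           # accumulated texture map
    best_vis = np.zeros(initial_maps[0].shape[:2])   # best visibility per texel
    for tex, vis in zip(initial_maps, visibility_maps):  # in fusion order
        take = vis > best_vis
        fused[take] = tex[take]
        best_vis = np.maximum(best_vis, vis)
    return fused
```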
In one embodiment, the texture rendering module 1114 is further configured to generate, from the surface normal vectors of each second parameterized model, a normal vector map corresponding to the respective two-dimensional image; and, for each frame of two-dimensional image, determine the visibility of each point of the target object in the normal vector map according to how closely the normal vector directions in that map align with the shooting direction of the two-dimensional image, thereby obtaining the visibility map corresponding to the two-dimensional image.
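A minimal sketch of this visibility test: the more closely a surface normal points back at the camera, the more visible the point. The `view_dir` convention (pointing from the surface toward the camera) and the clamping to [0, 1] are assumptions.

```python
import numpy as np

def visibility_from_normals(normal_map, view_dir):
    """normal_map: (H, W, 3) per-pixel surface normals;
    view_dir: (3,) direction from the surface toward the camera."""
    n = normal_map / np.linalg.norm(normal_map, axis=-1, keepdims=True)
    v = view_dir / np.linalg.norm(view_dir)
    cos = (n * v).sum(axis=-1)        # alignment with the shooting direction
    return np.clip(cos, 0.0, 1.0)     # back-facing points get zero visibility
```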
In one embodiment, the target parameterized model is a drivable parameterized model. In this embodiment, as shown in fig. 12, the target object three-dimensional reconstruction apparatus 1100 further includes:
an action driving module 1116 for acquiring action parameters; and substituting the action parameters into the target parameterized model to drive the target parameterized model to execute corresponding actions.
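As a toy illustration of substituting action parameters into a drivable model, the sketch below uses linear blend skinning: one rotation per joint, applied to the vertices through skinning weights. The joint layout, weights, and rotations are made-up stand-ins; the drivable parameterized model of this disclosure may use a different rig.

```python
import numpy as np

def drive(verts, weights, joint_rots, joint_pos):
    """verts: (V, 3) surface vertices; weights: (V, J) skinning weights;
    joint_rots: (J, 3, 3) action parameters as per-joint rotations;
    joint_pos: (J, 3) joint positions. Returns the reposed vertices."""
    posed = np.zeros_like(verts)
    for j in range(joint_pos.shape[0]):
        R, p = joint_rots[j], joint_pos[j]
        posed += weights[:, j:j + 1] * ((verts - p) @ R.T + p)
    return posed

# Identity rotations leave the model in its rest pose.
V, J = 100, 4
rest = drive(np.random.rand(V, 3), np.full((V, J), 1.0 / J),
             np.stack([np.eye(3)] * J), np.zeros((J, 3)))
```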
In the above apparatus for three-dimensional reconstruction of a target object, a three-dimensional parameterized model of the target object is created from a two-dimensional image of the target object, a three-dimensional continuous surface of the target object is generated from the two-dimensional image, and the three-dimensional continuous surface is meshed to obtain a three-dimensional mesh surface; the model parameters of the parameterized model are then adjusted to register the mesh surface of the parameterized model with the three-dimensional mesh surface, yielding the final target parameterized model of the target object. Because the three-dimensional continuous surface carries the detail information of the target object, the target parameterized model obtained through this registration has a stronger ability to express the shape of the target object and can retain its detail information. This avoids the limitation that a parameterized model obtained by conventional parameterized three-dimensional reconstruction expresses the target object's shape only coarsely and lacks detail, and it improves the accuracy of three-dimensional reconstruction. Parameterized models also tend to suffer from depth ambiguity; for example, the legs of a parameterized model reconstructed for a target person often come out too short or incomplete. The target parameterized model obtained by this method avoids such depth ambiguity and further improves reconstruction accuracy.
In addition, a three-dimensional model of the target object built purely from a three-dimensional continuous surface cannot be motion-driven, which limits its use, whereas a parameterized model can be. By adjusting the model parameters of the parameterized model and registering its mesh surface with the three-dimensional mesh surface extracted from the three-dimensional continuous surface, the resulting target parameterized model keeps the advantage that a parameterized model can be driven by motions, avoiding the limitation that the three-dimensional continuous surface cannot be driven. The target parameterized model therefore carries detail information, which improves the accuracy of three-dimensional reconstruction, and it can also be flexibly driven by motion parameters, which improves the applicability and widens the application range of the three-dimensionally reconstructed model.
For specific limitations of the target object three-dimensional reconstruction apparatus, reference may be made to the limitations of the target object three-dimensional reconstruction method above, which are not repeated here. The modules in the above apparatus may be implemented wholly or partially by software, hardware, or a combination thereof. The modules can be embedded, in hardware form, in or be independent of a processor in the computer device, or be stored in software form in a memory of the computer device, so that the processor can invoke them and perform the operations corresponding to each module.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 13. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing three-dimensional reconstruction data of the target object. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a method for three-dimensional reconstruction of a target object.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 14. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless communication can be realized through WIFI, an operator network, NFC (near field communication) or other technologies. The computer program is executed by a processor to implement a method for three-dimensional reconstruction of a target object. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
It will be appreciated by those skilled in the art that the configurations shown in figs. 13 and 14 are block diagrams of only part of the configurations relevant to the present disclosure and do not constitute a limitation on the computer devices to which the present disclosure may be applied; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is further provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of the above method embodiments when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, in which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
In one embodiment, a computer program product or computer program is provided that includes computer instructions stored in a computer-readable storage medium. The computer instructions are read by a processor of a computer device from a computer-readable storage medium, and the computer instructions are executed by the processor to cause the computer device to perform the steps in the above-mentioned method embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments can be implemented by a computer program instructing relevant hardware; the program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, a database or another medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the invention patent. It should be noted that those of ordinary skill in the art can make several variations and improvements without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (15)

1. A method for three-dimensional reconstruction of a target object, the method comprising:
acquiring a two-dimensional image including a target object;
creating a three-dimensional parameterized model of the target object from the two-dimensional image;
generating a three-dimensional continuous surface of the target object according to the two-dimensional image, wherein the three-dimensional continuous surface is a three-dimensional surface obtained by continuously representing the surface of the target object;
performing mesh division on the three-dimensional continuous surface to obtain a three-dimensional mesh surface;
and adjusting model parameters of the parameterized model to perform registration processing on a grid surface in the parameterized model and the three-dimensional grid surface to obtain a final target parameterized model of the target object.
2. The method of claim 1, further comprising:
adding an offset parameter into the parameterized model to obtain a deformable parameterized model;
the adjusting model parameters of the parameterized model to register a mesh surface in the parameterized model with the three-dimensional mesh surface includes:
adjusting offset parameters corresponding to surface vertices of the deformable parameterized model to register mesh surfaces in the parameterized model with the three-dimensional mesh surface.
3. The method of claim 2, wherein the model parameters of the deformable parameterized model further comprise global feature parameters, pose parameters, and shape parameters;
the adjusting offset parameters corresponding to the surface vertices of the deformable parameterized model to register the mesh surface in the parameterized model with the three-dimensional mesh surface comprises:
adjusting global feature parameters of the deformable parameterized model to depth register mesh surfaces in the parameterized model with the three-dimensional mesh surface;
adjusting the pose parameters and the shape parameters of the parameterized model after the depth registration, so as to register the mesh surface in the parameterized model with the three-dimensional mesh surface;
and adjusting offset parameters corresponding to surface vertices of the parameterized model after said registration, so as to register the characteristic parts of the mesh surface in the parameterized model with those of the three-dimensional mesh surface.
4. The method of claim 2, wherein said adjusting offset parameters corresponding to surface vertices of said deformable parameterized model to register mesh surfaces in said parameterized model with said three-dimensional mesh surface comprises:
obtaining a surface vertex semantic segmentation result of the parameterized model;
determining, according to the surface vertex semantic segmentation result, surface vertices in the deformable parameterized model where the geometric shape changes sharply;
fixing the offset parameters corresponding to the determined surface vertices, and adjusting the offset parameters corresponding to the surface vertices in the deformable parameterized model except the determined surface vertices, so as to perform registration processing on the mesh surface in the parameterized model and the three-dimensional mesh surface.
5. The method of claim 4, wherein the two-dimensional image is a single frame two-dimensional image; the parameterized model created according to the single-frame two-dimensional image is a first parameterized model;
the obtaining of the surface vertex semantic segmentation result of the parameterized model comprises:
acquiring a multi-frame two-dimensional image comprising the target object;
acquiring a second parameterized model of the target object, which is correspondingly established according to each two-dimensional image in the plurality of frames of two-dimensional images respectively;
semantic segmentation is carried out on each frame of two-dimensional image in the multi-frame of two-dimensional image to obtain a partially visible surface vertex semantic segmentation result of the corresponding second parameterized model;
and determining the surface vertex semantic segmentation result of the first parameterized model according to the partially visible surface vertex semantic segmentation result of each second parameterized model.
6. The method of claim 5, wherein after determining the surface vertex semantic segmentation result for the first parameterized model from the surface vertex semantic segmentation results for each of the second parameterized models, the method further comprises:
determining invisible surface vertices in the first parameterized model for which no semantic segmentation result has been determined;
determining visible surface vertexes within a preset neighborhood range of the invisible surface vertexes;
and determining the semantic segmentation result of the invisible surface vertex according to the determined semantic segmentation result of the visible surface vertex.
7. The method of claim 1, wherein the adjusting model parameters of the parameterized model to register mesh surfaces in the parameterized model with the three-dimensional mesh surface comprises:
obtaining a target loss function of a plurality of constraints; the target loss function comprises a grid difference loss function, a surface topological structure loss function and a deformation loss function;
iteratively adjusting model parameters of the parameterized model toward minimizing the objective loss function to register a mesh surface in the parameterized model with the three-dimensional mesh surface.
8. The method of claim 1, wherein the two-dimensional image is a single frame two-dimensional image; the parameterized model created according to the single-frame two-dimensional image is a first parameterized model;
the method further comprises the following steps:
acquiring a multi-frame two-dimensional image comprising the target object;
acquiring a second parameterized model of the target object, which is correspondingly established according to each two-dimensional image in the plurality of frames of two-dimensional images respectively;
mapping points corresponding to the target object in the corresponding two-dimensional image to a texture space according to texture coordinates corresponding to the surface vertex of each second parameterized model to obtain initial texture maps of the target object corresponding to each two-dimensional image frame;
fusing the initial texture maps to obtain a texture map of the target object;
and performing texture rendering on the target parameterized model according to the texture map.
9. The method according to claim 8, wherein said fusing each of the initial texture maps to obtain the texture map of the target object comprises:
determining a fusion sequence corresponding to each frame of two-dimensional image according to the root node direction of each second parameterized model;
and fusing the initial texture maps corresponding to the two-dimensional images of the frames according to the fusion sequence to obtain the texture map of the target object.
10. The method of claim 9, further comprising:
acquiring visibility maps of the target objects corresponding to the two-dimensional images of the frames respectively;
the fusing the initial texture maps corresponding to the two-dimensional images of the frames according to the fusion sequence to obtain the texture map of the target object includes:
according to the fusion sequence, selecting a current two-dimensional image from a first two-dimensional image, and generating a texture map corresponding to the current two-dimensional image according to an initial texture map and a visibility map corresponding to the current two-dimensional image;
fusing the texture map corresponding to the current two-dimensional image with the accumulated fused texture map;
and after each fusion, taking the next two-dimensional image in the fusion order as the current two-dimensional image, and returning to the step of generating the texture map corresponding to the current two-dimensional image according to the initial texture map and the visibility map corresponding to the current two-dimensional image, until the iteration stops, so as to obtain the texture map of the target object.
11. The method according to claim 10, wherein the obtaining of the visibility map corresponding to each two-dimensional image of each frame comprises:
generating a normal vector map corresponding to the corresponding two-dimensional image according to the normal vector of each second parameterized model surface;
and for each frame of two-dimensional image, determining the visibility of each point of the target object in the normal vector map according to the degree of closeness between the normal vector direction in the normal vector map corresponding to the two-dimensional image and the shooting direction of the two-dimensional image, and obtaining the visibility map corresponding to the two-dimensional image.
12. The method according to any one of claims 1 to 11, wherein the target parameterized model is a drivable parameterized model;
the method further comprises the following steps:
acquiring action parameters;
and substituting the action parameters into the target parameterized model to drive the target parameterized model to execute corresponding actions.
13. An apparatus for three-dimensional reconstruction of a target object, the apparatus comprising:
an image acquisition module for acquiring a two-dimensional image including a target object;
a parametric model creation module for creating a three-dimensional parametric model of the target object from the two-dimensional image;
a continuous surface generation module for generating a three-dimensional continuous surface of the target object from the two-dimensional image; a three-dimensional continuous surface that is a three-dimensional surface obtained by continuously representing a surface of the target object;
the mesh division module is used for carrying out mesh division on the three-dimensional continuous surface to obtain a three-dimensional mesh surface;
and the model parameter adjusting module is used for adjusting the model parameters of the parameterized model so as to perform registration processing on the grid surface in the parameterized model and the three-dimensional grid surface to obtain the final target parameterized model of the target object.
14. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 12.
15. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 12.
CN202110167508.0A 2021-02-07 2021-02-07 Target object three-dimensional reconstruction method and device, computer equipment and storage medium Pending CN113593001A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110167508.0A CN113593001A (en) 2021-02-07 2021-02-07 Target object three-dimensional reconstruction method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110167508.0A CN113593001A (en) 2021-02-07 2021-02-07 Target object three-dimensional reconstruction method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113593001A (en) 2021-11-02

Family

ID=78238066

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110167508.0A Pending CN113593001A (en) 2021-02-07 2021-02-07 Target object three-dimensional reconstruction method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113593001A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115375823A (en) * 2022-10-21 2022-11-22 北京百度网讯科技有限公司 Three-dimensional virtual clothing generation method, device, equipment and storage medium
CN115601511A (en) * 2022-12-14 2023-01-13 深圳思谋信息科技有限公司(Cn) Three-dimensional reconstruction method and device, computer equipment and computer readable storage medium
CN115601511B (en) * 2022-12-14 2023-03-14 深圳思谋信息科技有限公司 Three-dimensional reconstruction method and device, computer equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination