CN116228994A - Three-dimensional model acquisition method, device, equipment and storage medium - Google Patents

Three-dimensional model acquisition method, device, equipment and storage medium

Info

Publication number
CN116228994A
CN116228994A (application CN202310513127.2A; granted as CN116228994B)
Authority
CN
China
Prior art keywords
target object
pictures
geometric
loss function
function value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310513127.2A
Other languages
Chinese (zh)
Other versions
CN116228994B (en)
Inventor
龙霄潇
林铖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority claimed from CN202310513127.2A
Publication of CN116228994A
Application granted
Publication of CN116228994B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 15/00 3D [Three Dimensional] image rendering
    • G06T 15/005 General purpose rendering architectures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/54 Extraction of image or video features relating to texture
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/56 Extraction of image or video features relating to colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2200/00 Indexing scheme for image data processing or generation, in general
    • G06T 2200/08 Indexing scheme for image data processing or generation, in general involving all processing steps from image acquisition to 3D model generation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 Road transport of goods or passengers
    • Y02T 10/10 Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Computer Graphics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Geometry (AREA)
  • Image Generation (AREA)
  • Apparatus For Radiation Diagnosis (AREA)

Abstract

The application relates to a three-dimensional model acquisition method, apparatus, device, and storage medium, in the technical field of artificial intelligence. The method includes the following steps: acquiring N object pictures; acquiring geometric features of a target object and texture features of the target object based on the N object pictures; generating N reconstructed pictures of the target object based on the geometric features and texture features of the target object; iteratively optimizing the geometric features and texture features of the target object based on differences between the N reconstructed pictures of the target object and the N object pictures; and, in response to the optimization of the geometric features of the target object being completed, constructing a three-dimensional model of the target object based on the geometric features. The scheme can improve the modeling effect of three-dimensional objects.

Description

Three-dimensional model acquisition method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular, to a method, an apparatus, a device, and a storage medium for acquiring a three-dimensional model.
Background
Three-dimensional object reconstruction refers to the process of obtaining a three-dimensional model of an object appearing in two-dimensional pictures by processing two-dimensional pictures taken from different shooting angles.
In the related art, depth estimation can be performed on each input image by a multi-view stereo depth estimation method; the depth maps corresponding to the acquired images are then fused to obtain the point cloud structure of the object to be reconstructed, and the mesh surface of the object's geometric structure is recovered from the point cloud by Poisson reconstruction or similar methods.
However, the scheme in the related art depends on the image quality of the multi-view images, is generally sensitive to illumination variation, has difficulty achieving high-quality reconstruction of an object, and yields a poor modeling effect.
Disclosure of Invention
The embodiments of the application provide a three-dimensional model acquisition method, apparatus, device, and storage medium, which can improve the modeling effect of three-dimensional objects. The technical scheme is as follows:
in one aspect, a method for obtaining a three-dimensional model is provided, the method comprising:
acquiring N object pictures; the N object pictures are pictures obtained by carrying out image acquisition on the same target object at different angles; n is an integer greater than or equal to 2;
based on the N object pictures, obtaining the geometric characteristics of the target object and the texture characteristics of the target object;
Generating N reconstructed pictures of the target object based on the geometric features of the target object and the texture features of the target object;
iteratively optimizing the geometric features of the target object and the texture features of the target object based on differences between the N reconstructed pictures of the target object and the N object pictures;
and constructing a three-dimensional model of the target object based on the geometric features of the target object in response to the optimization of the geometric features of the target object being completed.
In another aspect, there is provided a three-dimensional model acquisition apparatus including:
the image acquisition module is used for acquiring N object images; the N object pictures are pictures obtained by carrying out image acquisition on the same target object at different angles; n is an integer greater than or equal to 2;
the feature extraction module is used for acquiring geometric features of the target object and texture features of the target object based on the N object pictures;
the image reconstruction module is used for generating N reconstructed images of the target object based on the geometric characteristics of the target object and the texture characteristics of the target object;
The optimization module is used for iteratively optimizing the geometric features of the target object and the texture features of the target object based on differences between the N reconstructed pictures of the target object and the N object pictures;
and the model construction module is used for constructing a three-dimensional model of the target object based on the geometric characteristics of the target object in response to the completion of the geometric characteristic optimization of the target object.
In a possible implementation manner, the feature extraction module is configured to input the N object pictures into feature extraction branches in a picture reconstruction model, and obtain geometric features of the target object and texture features of the target object output by the feature extraction branches of the picture reconstruction model;
the image reconstruction module is used for generating the N reconstructed images based on the undirected distance field corresponding to the geometric features of the target object and the texture features of the target object through the volume rendering branches in the image reconstruction model;
the optimization module is configured to:
acquire a loss function value based on differences between the N reconstructed pictures of the target object and the N object pictures;
update model parameters of the picture reconstruction model based on the loss function value;
and input the N object pictures into the picture reconstruction model with updated parameters to obtain new geometric features of the target object and new texture features of the target object, output by the feature extraction branch of the picture reconstruction model.
In one possible implementation manner, the picture reconstruction module is further configured to generate, through the volume rendering branch in the picture reconstruction model, N new reconstructed pictures based on the undirected distance field corresponding to the new geometric features of the target object and the new texture features of the target object;
the apparatus further comprises:
the optimization module is further configured to determine that the geometric feature optimization of the target object is completed in response to the differences between the N new reconstructed pictures and the N object pictures meeting convergence conditions;
the optimization module is further configured to, in response to the differences between the N new reconstructed pictures and the N object pictures not meeting the convergence condition, obtain a new loss function value based on the differences between the N new reconstructed pictures of the target object and the N object pictures, and update model parameters of the picture reconstruction model based on the new loss function value.
In one possible implementation, the volume rendering formula used by the volume rendering branch includes a density function;
the density function is used to translate the geometric features of the target object into a probability density.
In one possible implementation, the geometric feature is characterized based on an undirected distance field; the density function is used to convert the undirected distance values in the geometric feature into probability densities by a differentiable indicator function.
In one possible implementation, the loss function value includes at least one of a first loss function value and a second loss function value:
the first loss function value is a function value obtained based on color differences between corresponding pixels of the N reconstructed pictures and the N object pictures;
the second loss function value is a function value obtained based on color differences between corresponding patches of the N reconstructed pictures and the N object pictures.
In one possible implementation, the loss function value further includes at least one of a third loss function value, a fourth loss function value, and a fifth loss function value:
the third loss function value is used for optimizing the module length of the undirected distance field;
The fourth loss function value is used to optimize the complexity of the undirected distance field;
the fifth loss function value is used to optimize the contour of the undirected distance field.
In one possible implementation, the third loss function value is calculated based on the difference between the gradient norm of the undirected distance field and 1 (an eikonal-style constraint, since a true distance field has unit gradient norm).
In one possible implementation, the fourth loss function value is used to exclude sampling points of non-object surfaces in the undirected distance field.
In one possible implementation, the fifth loss function value is calculated based on a difference between the contour of the undirected distance field and the contour of the target object.
In one possible implementation, the model building module is configured to extract an explicit mesh surface of the target object from the geometric features of the target object in response to the optimization of the geometric features of the target object being completed.
In another aspect, a computer device is provided, including a processor and a memory, the memory storing at least one computer instruction that is loaded and executed by the processor to implement the above three-dimensional model acquisition method.
In another aspect, a computer-readable storage medium is provided, having stored therein at least one computer instruction that is loaded and executed by a processor to implement the above three-dimensional model acquisition method.
In another aspect, a computer program product or computer program is provided, the computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the three-dimensional model acquisition method provided in the above-described various alternative implementations.
The technical scheme provided by this application can include the following beneficial effects:
acquiring geometric features and texture features of the target object based on N object pictures obtained by acquiring images of the same target object from different angles; generating N reconstructed pictures of the target object based on the geometric features and texture features of the target object; optimizing the geometric features and texture features of the target object based on differences between the N reconstructed pictures of the target object and the N object pictures; and constructing a three-dimensional model of the target object based on the geometric features once their optimization is completed. In this scheme, after the initial geometric features and texture features of the target object are generated from the N object pictures, the geometric features and texture features can be continuously and iteratively optimized with the N object pictures as references, which improves the accuracy of the geometric features of the target object and thus the modeling effect of the three-dimensional object.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
FIG. 1 is a schematic diagram of a system according to an exemplary embodiment of the present application;
FIG. 2 is a flow chart of a three-dimensional model acquisition method according to an exemplary embodiment of the present application;
FIG. 3 is a frame diagram of a three-dimensional model reconstruction according to the present application;
FIG. 4 is a flow chart of another three-dimensional model acquisition method shown in an exemplary embodiment of the present application;
FIG. 5 is a schematic diagram of a model structure according to the present application;
FIG. 6 is a flow chart of yet another three-dimensional model acquisition method shown in an exemplary embodiment of the present application;
FIG. 7 is a frame diagram of another three-dimensional model reconstruction according to the present application;
FIG. 8 is a schematic diagram of modeling results of a conventional modeling scheme;
FIG. 9 is a schematic representation of modeling results of an open-boundary object based on an undirected distance field according to the present application;
FIG. 10 is a schematic representation of modeling results of a closed boundary object based on an undirected distance field according to the present application;
FIG. 11 is a block diagram of a three-dimensional model acquisition device provided in one embodiment of the present application;
Fig. 12 shows a block diagram of a computer device according to an exemplary embodiment of the present application.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. Where the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application; rather, they are merely examples of apparatus and methods consistent with some aspects of the present application as detailed in the appended claims.
The embodiment of the application provides a method for reconstructing a three-dimensional model through a two-dimensional image. For ease of understanding, several terms referred to in this application are explained below.
1) Artificial intelligence (Artificial Intelligence, AI): artificial intelligence is the theory, method, technique and application system that uses a digital computer or a digital computer-controlled machine to simulate, extend and expand human intelligence, sense the environment, acquire knowledge and use the knowledge to obtain optimal results. In other words, artificial intelligence is an integrated technology of computer science that attempts to understand the essence of intelligence and to produce a new intelligent machine that can react in a similar way to human intelligence. Artificial intelligence, i.e. research on design principles and implementation methods of various intelligent machines, enables the machines to have functions of sensing, reasoning and decision.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning. The scheme shown in this application mainly relates to computer vision technology and to the machine learning/deep learning direction.
2) Computer Vision technology (CV): computer vision is the science of studying how to make machines "see"; more specifically, it uses cameras and computers in place of human eyes to identify and measure targets and to perform further graphic processing, so that images are processed into forms more suitable for human observation or for transmission to instruments for detection. As a scientific discipline, computer vision studies related theories and technologies in an attempt to build artificial intelligence systems that can acquire information from images or multidimensional data. Computer vision technology typically includes image processing, image recognition, image semantic understanding, image retrieval, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, and simultaneous localization and mapping, as well as common biometric technologies such as face recognition and fingerprint recognition.
3) Machine Learning (ML): machine learning is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. It specializes in studying how computers simulate or implement human learning behavior to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve their own performance. Machine learning is the core of artificial intelligence and the fundamental way of giving computers intelligence; it is applied throughout all areas of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from instruction.
4) Implicit neural representation (Implicit Neural Representations): a method of representing signals such as input images, audio, and point clouds as functions by means of a neural network. A suitable network F is found for the input x so that the network can characterize a function Φ; because the function Φ is continuous, the represented signal is continuous and differentiable. The advantages are more efficient memory management, finer signal detail, images that remain well defined under higher-order differentiation, and a completely new tool for solving inverse problems.
5) Neural rendering (neural rendering): neural rendering is a method based on deep neural networks and physics engines that can create novel images and video clips from existing scenes. It allows the user to control scene properties such as lighting, camera parameters, poses, geometry, shape, and semantic structure.
6) Volume rendering (volume rendering): in scientific visualization and computer graphics, volume rendering is a set of techniques for displaying 2D projections of 3D discrete sampled datasets (typically 3D scalar fields).
7) Directed distance field (Signed Distance Function, SDF): in mathematics and its applications, a signed distance function (or directed distance function) gives the orthogonal distance from a given point x in a metric space to the boundary of a set Ω, with the sign determined by whether x lies inside Ω. The function has positive values at points x inside Ω, its value decreases as x approaches the boundary of Ω (where the signed distance function is zero), and it takes negative values outside Ω. However, the alternative convention is sometimes adopted (i.e., negative inside Ω and positive outside).
8) Undirected distance field (Unsigned Distance Function, UDF): in mathematics and its application, the unsigned distance function is the orthogonal distance from a given point x in metric space to the boundary of the set Ω.
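For intuition, the two distance fields defined in 7) and 8) can be written compactly as follows (a standard formulation consistent with the definitions above, not quoted from the patent):

$$
f_{\mathrm{SDF}}(x)=\begin{cases}+\,d(x,\partial\Omega), & x\in\Omega\\ -\,d(x,\partial\Omega), & x\notin\Omega\end{cases}
\qquad
f_{\mathrm{UDF}}(x)=d(x,\partial\Omega)=\min_{p\in\partial\Omega}\lVert x-p\rVert
$$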
Fig. 1 is a schematic diagram of a system used in the three-dimensional model acquisition method according to an exemplary embodiment of the present application, and as shown in fig. 1, the system includes: server 110 and terminal 120.
The server 110 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs (Content Delivery Network, content delivery networks), basic cloud computing services such as big data and artificial intelligence platforms.
The terminal 120 may be a terminal device having a network connection function and a data processing function, for example, the terminal 120 may be a smart phone, a tablet computer, an electronic book reader, smart glasses, a smart watch, a smart television, a laptop portable computer, a desktop computer, and the like.
Optionally, the system includes one or more servers 110 and a plurality of terminals 120. The number of servers 110 and terminals 120 is not limited in the embodiment of the present application.
The terminal and the server are connected through a communication network. Optionally, the communication network is a wired network or a wireless network.
At least one of the terminal and the server may store/run an application program such as a modeling tool or plug-in for modeling a three-dimensional object through a two-dimensional picture.
For example, an application installation package may be provided in the server 110, and the terminal 120 may download the application installation package from the server 110 and install the application; the subsequent terminal 120 models the three-dimensional object with the two-dimensional picture through the application.
Only one terminal is shown in fig. 1, but in different embodiments there are a number of other terminals that can access the server 110. Optionally, there are one or more terminals corresponding to the developer, on which a development and editing platform for developing the application program is installed, the developer may edit and update the application program on the terminal, and transmit the updated application program installation package to the server 110 through a wired or wireless network, and the terminal 120 may download the application program installation package from the server 110 to implement update of the application program.
For another example, the server 110 may install and run the application, the terminal 120 may send a two-dimensional picture to the server 110, and the server 110 models the three-dimensional object by using the received two-dimensional picture through the application, and returns the modeling result to the terminal 120.
Alternatively, the wireless network or wired network described above uses standard communication techniques or protocols. The network is typically the Internet, but may be any network including, but not limited to, a local area network (Local Area Network, LAN), metropolitan area network (Metropolitan Area Network, MAN), wide area network (Wide Area Network, WAN), a mobile, wired or wireless network, a private network, or any combination of virtual private networks. In some embodiments, data exchanged over the network is represented using techniques or formats of HyperText Mark-up Language (HTML), extensible markup Language (Extensible Markup Language, XML), and the like. In addition, all or some of the links may be encrypted using conventional encryption techniques such as secure socket layer (Secure Socket Layer, SSL), transport layer security (Transport Layer Security, TLS), virtual private network (Virtual Private Network, VPN), internet protocol security (Internet Protocol Security, IPsec), and the like. In other embodiments, custom or dedicated data communication techniques may also be used in place of or in addition to the data communication techniques described above. The application is not limited herein.
Fig. 2 shows a flowchart of a three-dimensional model acquisition method according to an exemplary embodiment of the present application. The method is performed by a computer device, which may be implemented as a terminal or a server, such as the terminal or the server shown in Fig. 1. As shown in Fig. 2, the three-dimensional model acquisition method includes the following steps.
Step 210: acquiring N object pictures; the N object pictures are pictures obtained by carrying out image acquisition at different angles on the same target object; n is an integer greater than or equal to 2.
In the embodiment of the present application, for a target object to be reconstructed, image acquisition may be performed on the target object through N different angles in advance, so as to obtain N object pictures.
Step 220: and acquiring geometric features of the target object and texture features of the target object based on the N object pictures.
Wherein the geometric feature may be a feature for representing the geometry of the target object.
The geometric characteristics of the target object refer to the three-dimensional morphology and structure characteristics of the target object. For example, the geometric features of the target object described above may be used to represent the shape and size of the target object.
The texture feature may be texture information for characterizing the surface of the target object.
Step 230: n reconstructed pictures of the target object are generated based on the geometric features of the target object and the texture features of the target object.
In the embodiment of the present application, since the geometric features can represent the geometric structure of the target object, and the texture features can represent the texture (i.e., color, pattern, etc.) of the surface of the target object, a two-dimensional picture observing the target object from a specified viewing angle (i.e., the reconstructed picture) can be generated from the geometric features and texture features of the target object.
In the embodiment of the application, the computer device may generate N reconstructed pictures of the target object based on the geometric features of the target object and the texture features of the target object by means of volume rendering and the like.
Step 240: and optimizing the geometric features of the target object and the texture features of the target object based on differences between the N reconstructed pictures of the target object and the N object pictures.
The N reconstructed pictures correspond one-to-one with the N object pictures; specifically, a reconstructed picture and its corresponding object picture share the same camera parameters. The more accurate the geometric features and texture features of the object, the more similar each reconstructed picture is to its corresponding object picture. The computer device can therefore optimize the geometric features and texture features of the object according to the differences between the generated reconstructed pictures and the corresponding object pictures, so that the new reconstructed pictures obtained from the optimized geometric and texture features become more similar to the corresponding object pictures.
Steps 220 to 240 may be performed iteratively; that is, after a new reconstructed picture is obtained from the optimized geometric features and texture features, the features are optimized again according to the differences between the new reconstructed picture and the corresponding object picture, and this process is repeated until a convergence condition is reached.
The convergence condition may include: the number of iterations reaching a count threshold; the difference between the geometric features and texture features before and after two adjacent optimizations being smaller than a feature difference threshold; the difference between the reconstructed pictures obtained before and after two adjacent optimizations being smaller than a first picture difference threshold; the difference between the reconstructed pictures obtained after the last optimization and the corresponding object pictures being smaller than a second picture difference threshold; and the like. The embodiments of the present application do not limit the convergence condition; a minimal sketch of the loop with two of these conditions follows.
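The following Python sketch illustrates the iterative loop of steps 220 to 240 under two of the convergence conditions above (a count threshold and a picture difference threshold). All helper functions here are simplified stand-ins invented for illustration; the patent's actual feature extraction, rendering, and optimization are neural-network based:

```python
import numpy as np

# Hypothetical stand-ins for the patent's feature extraction, rendering, and
# optimization steps; real implementations would be neural networks.
def extract_features(pics):
    return np.zeros(8), np.zeros(8)                      # geometry, texture placeholders

def render_views(geometry, texture, n):
    return [geometry.sum() + texture.sum() + np.zeros((4, 4)) for _ in range(n)]

def optimize_features(geometry, texture, recon, pics, lr=0.1):
    grad = np.mean([np.mean(r - p) for r, p in zip(recon, pics)])
    return geometry - lr * grad, texture - lr * grad

def reconstruct(object_pictures, max_iters=1000, diff_threshold=1e-4):
    geometry, texture = extract_features(object_pictures)            # step 220
    for _ in range(max_iters):                                       # count-threshold condition
        recon = render_views(geometry, texture, len(object_pictures))  # step 230
        diff = np.mean([np.abs(r - p).mean() for r, p in zip(recon, object_pictures)])
        if diff < diff_threshold:                                    # picture-difference condition
            break
        geometry, texture = optimize_features(geometry, texture, recon, object_pictures)  # step 240
    return geometry                                                  # step 250 builds the model from this
```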
Step 250: and constructing a three-dimensional model of the target object based on the geometric features of the target object in response to the optimization of the geometric features of the target object being completed.
In the embodiment of the application, after the optimization of the geometric features of the target object is completed, the computer device may construct a corresponding three-dimensional model based on the geometric features of the target object after the last optimization.
The above-mentioned construction of the three-dimensional model may refer to generating model data for characterizing the three-dimensional model of the target object based on the geometric features of the target object.
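As one way of turning the optimized geometric features into model data (an illustrative approach, not necessarily the extraction method used in the patent), a mesh surface can be pulled out of a discretized distance field with a marching-cubes style algorithm; for an unsigned field, a common approximation is to march on a small positive iso-level, which yields a thin shell around the true surface:

```python
import numpy as np
from skimage.measure import marching_cubes

def mesh_from_udf_grid(udf_grid, iso=0.01, spacing=(1.0, 1.0, 1.0)):
    """Extract an approximate surface mesh from a sampled unsigned distance grid.

    Marching cubes expects a level-set crossing; for an unsigned field we march
    on a small positive iso-value, yielding a thin double-sided shell.
    """
    verts, faces, normals, _ = marching_cubes(udf_grid, level=iso, spacing=spacing)
    return verts, faces, normals

# Toy example: UDF of a sphere of radius 0.5 sampled on a 64^3 grid.
xs = np.linspace(-1, 1, 64)
X, Y, Z = np.meshgrid(xs, xs, xs, indexing="ij")
udf = np.abs(np.sqrt(X**2 + Y**2 + Z**2) - 0.5)
verts, faces, _ = mesh_from_udf_grid(udf, iso=0.02)
```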
In summary, according to the scheme shown in the embodiments of the present application, geometric features and texture features of the target object are obtained based on N object pictures acquired of the same target object from different angles; N reconstructed pictures of the target object are generated based on the geometric features and texture features; the geometric features and texture features are optimized based on differences between the N reconstructed pictures of the target object and the N object pictures; and a three-dimensional model of the target object is constructed based on the geometric features once their optimization is completed. In this scheme, after the initial geometric features and texture features of the target object are generated from the N object pictures, they can be continuously and iteratively optimized with the N object pictures as references, which improves the accuracy of the geometric features of the target object and thus the modeling effect of the three-dimensional object.
Based on the scheme shown in fig. 2, in one possible implementation, the step 230 may be replaced by: generating N reconstructed pictures of the target object based on the undirected distance field corresponding to the geometric features of the target object and the texture features of the target object.
In one possible implementation, the geometry of the object may be characterized by an undirected distance field. That is, the geometric features of the target object include an undirected distance field.
In the present embodiment, a field is a quantity (a scalar) defined over all (continuous) spatial and/or temporal coordinates, such as an electromagnetic field or a gravitational field. That is, a field is a continuous concept that maps a high-dimensional vector to a scalar.
Wherein the undirected distance field is a neural field, which is a field that is fully or partially parameterized with a neural network.
In the vision field, a neural field can be regarded as a process of generating a target scalar (e.g., color, depth, etc.) by fitting an objective function with a multi-layer perceptron (Multilayer Perceptron, MLP) network that takes spatial coordinates and information of other dimensions (time, camera pose, etc.) as input.
Since a directed distance field must divide space into an interior and an exterior, it can only model closed-surface objects and cannot reconstruct open-boundary objects (which have no interior/exterior division), such as clothing. If a directed distance field is used to express the shape of such an object, the open-boundary object is erroneously modeled as a closed-surface object and the correct geometric topology cannot be reflected.
The undirected distance field is an implicit expression of the surface of the target object in three-dimensional space, and is a spatial field in which each voxel (volume unit corresponding to a pixel) records a minimum distance between itself and the surface (also called a boundary) of the target object; the position of the boundary of the target object can be determined through the minimum distance between each point in the three-dimensional space and the surface of the target object, so that the geometric structures such as the shape, the size and the like of the target object are represented.
Specifically, the undirected distance field of the target object may be represented by a distance field function, to which coordinates of any one point in the three-dimensional space are input, and the distance field function may calculate an orthogonal distance between the point and a boundary set of the target object in the three-dimensional space (or a nearest distance between the point and a surface of the target object).
Since the undirected distance field only records the nearest distance from a certain point in space to the surface of the object, no sign is recorded, so that the space inside and outside do not need to be divided, and the open boundary object can be accurately modeled.
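As a concrete illustration of what an undirected distance field stores (an explanatory sketch of the discretized definition above, not the patent's neural representation), the nearest distance from query points to a sampled object surface can be computed directly:

```python
import numpy as np
from scipy.spatial import cKDTree

def sample_udf(surface_points, query_points):
    """Unsigned distance field: nearest distance from each query point to the surface.

    surface_points: (M, 3) points sampled on the object surface.
    query_points:   (Q, 3) points (e.g., voxel centers) at which to evaluate the UDF.
    """
    tree = cKDTree(surface_points)           # spatial index over the surface samples
    distances, _ = tree.query(query_points)  # nearest-neighbor distance, no sign
    return distances

# Example: surface of a unit circle in the z=0 plane, queried at two points.
theta = np.linspace(0, 2 * np.pi, 512, endpoint=False)
surface = np.stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)], axis=1)
print(sample_udf(surface, np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])))
# ≈ [1.0, 1.0]: both points are unit distance from the circle, with no inside/outside sign.
```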
In addition to the Undirected Distance Field (UDF), a truncated directed distance field (Truncated Signed Distance Function, TSDF) can be used as a geometric representation of the object. As a geometric representation, the TSDF records the truncated signed distance to the obstacle surface (i.e., the object surface) along the ray direction, and the surface where the TSDF value is 0 is the obstacle surface.
The truncated directed distance field is also an implicit expression of the target object's surface in three-dimensional space. It is likewise a spatial field in which each voxel (the volume unit corresponding to a pixel, i.e., a point in space) records a truncated signed distance between itself and the surface (also called the boundary) of the target object, the distance being truncated when it exceeds a threshold. The truncated signed distance carries a sign: a positive sign indicates that the voxel is outside the target object, and a negative sign indicates that it is inside. The position of the target object's boundary can be determined from the truncated signed distances between points in three-dimensional space and the object surface, thereby representing geometric structures such as the shape and size of the target object.
Specifically, the truncated directed distance field of the target object may also be represented by a distance field function: inputting the coordinates of any point in three-dimensional space into the function yields the truncated signed distance between that point and the boundary set of the target object in three-dimensional space.
In summary, according to the scheme shown in the embodiments of the present application, geometric features and texture features of the target object are obtained based on N object pictures acquired of the same target object from different angles; N reconstructed pictures of the target object are generated based on those features; the features are optimized based on differences between the N reconstructed pictures and the N object pictures; and a three-dimensional model of the target object is constructed based on the geometric features once their optimization is completed. In this scheme, the geometric features are characterized based on an undirected distance field, which can accurately represent the geometric structure of both closed-boundary and open-boundary objects; the method is therefore suitable for modeling objects with and without open boundaries, expands the application range of three-dimensional object modeling, and improves the modeling effect.
According to the scheme disclosed in the embodiments of the application, reconstruction of objects of arbitrary topology can be realized based on implicit neural representation: the object only needs to be photographed from multiple views (for example, 360-degree multi-view photographing), and an accurate geometric structure of the object is automatically reconstructed from the input multi-view pictures, eliminating a complicated manual modeling process.
Referring to fig. 3, a frame diagram of a three-dimensional model reconstruction according to the present application is shown. As shown in fig. 3, first, image acquisition of different angles is performed on a target object 31 by a camera, and N object pictures 32 are obtained.
The computer device then performs feature extraction on the N object pictures 32 resulting in an undirected distance field 33 describing the geometry of the target object and a color radiation field 34 describing the texture of the target object.
The computer device then performs a picture generation based on the undirected distance field 33 and the color radiation field 34, resulting in N reconstructed pictures 35.
The computer device performs iterative optimization on the undirected distance field 33 and the color radiation field 34 according to the differences between the N object pictures 32 and the N reconstructed pictures 35, and updates the N reconstructed pictures 35 after each optimization until the iterative optimization is completed.
After the iterative optimization is completed, the computer device creates a three-dimensional model 36 of the target object 31 from the undirected distance field 33 after the last iterative optimization.
Fig. 4 shows a flowchart of a three-dimensional model acquisition method according to an exemplary embodiment of the present application. The method may be performed by a computer device. That is, in the embodiment shown in fig. 2, steps 220 to 240 may be implemented as steps 220a to 240a.
Step 220a: inputting the N object pictures into the feature extraction branches in the picture reconstruction model to obtain the geometric features of the target object and the texture features of the target object, which are output by the feature extraction branches of the picture reconstruction model.
For example, taking the above-mentioned geometric features characterized based on an undirected distance field as an example, the solution shown in the embodiments of the present application first characterizes the target object to be reconstructed as an undirected distance field and a neural radiation field with respect to texture, both fields using a neural network as an expression medium. Referring to fig. 5, a schematic diagram of a model structure according to the present application is shown. As shown in fig. 5, the feature extraction branches 510 in the model include geometric feature extraction branches 510a and texture feature extraction branches 510b.
The geometric feature extraction branches 510a and the texture feature extraction branches 510b are respectively used for performing geometric and texture characterization on the target object according to the input N object pictures 530.
Optionally, the geometric feature extraction branch 510a is a machine learning network (such as an MLP network) containing a plurality of network weight parameters; it takes picture data as input and outputs an undirected distance field.
Optionally, the texture feature extraction branch 510b is likewise a machine learning network (such as an MLP network) containing a plurality of network weight parameters; it takes picture data as input and outputs a color radiation field.
Wherein the geometrical feature extraction branch 510a may be implemented as a multi-layer perceptron (UDF-MLP) for outputting an undirected distance field and the texture feature extraction branch 510b may be implemented as a multi-layer perceptron (Color-MLP) for outputting a Color radiation field.
Unlike conventional methods that use explicit point clouds or meshes as the geometric representation of the object to be reconstructed, this scheme uses an undirected distance field (UDF-MLP) to represent the geometry of the object. Whereas conventional methods express the texture information of the object with explicit pictures, this scheme expresses it with an implicit color radiation field (Color-MLP). Both fields are expressed with fully connected neural networks. Through iterative optimization, the scheme aims to make the UDF-MLP and Color-MLP conform to the geometric and texture characteristics of the object to be reconstructed: given the input multi-view pictures, the color difference between the input views and the synthesized views is computed pixel by pixel and minimized.
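A minimal sketch of the two branches as fully connected networks is given below; the layer counts and widths echo the volume-rendering description later in this document (8 layers of 256 channels for geometry, 128-channel layers for color), but the exact architecture here is an assumption for illustration:

```python
import torch
import torch.nn as nn

class UDFMLP(nn.Module):
    """Maps a 3D point x to an unsigned distance value (the geometric field)."""
    def __init__(self, hidden=256, layers=8):
        super().__init__()
        dims = [3] + [hidden] * layers
        self.body = nn.Sequential(*[
            m for i in range(layers)
            for m in (nn.Linear(dims[i], dims[i + 1]), nn.ReLU())
        ])
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):
        h = self.body(x)
        return torch.abs(self.head(h))  # unsigned: distances are non-negative

class ColorMLP(nn.Module):
    """Maps a 3D point and a view direction to an RGB color (the texture field)."""
    def __init__(self, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),  # RGB in [0, 1]
        )

    def forward(self, x, view_dir):
        return self.net(torch.cat([x, view_dir], dim=-1))
```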
Wherein both the above-mentioned undirected distance field and the color radiation field can be regarded as radiation fields.
The radiation field can be regarded as a function: if a ray is emitted into a static space from some angle, the function can be queried for the probability density of each point in space along the ray, and for the color that the position exhibits under that ray angle. The probability density is used to compute weights, and the pixel color is rendered by a weighted summation of the colors at the sampled points. The scheme shown in the embodiments of the present application models the Red-Green-Blue (RGB) values and probability densities with neural networks.
The execution of step 220a is similar to the process of extracting the directed distance field and the color radiation field of an object in the Neural Radiance Fields (NeRF) algorithm or NeRF-like algorithms, and is not repeated here.
Step 230a: and generating N reconstructed pictures based on the undirected distance field corresponding to the geometric characteristics of the target object and the texture characteristics of the target object through a volume rendering branch in the picture reconstruction model.
The scheme shown in the embodiments of the application synthesizes new views through neural rendering. Taking Fig. 5 as an example, the volume rendering branch 520 iteratively optimizes the UDF-MLP and Color-MLP using a neural rendering approach: geometric and texture information is extracted from the UDF-MLP and Color-MLP through the UDF-based neural rendering formula provided by the algorithm, and a new view is synthesized.
Step 240a: acquiring a loss function value based on differences between the N reconstructed pictures of the target object and the N object pictures; updating model parameters of the picture reconstruction model based on the loss function value; and inputting the N object pictures into the picture reconstruction model with updated parameters to obtain new geometric features of the target object and new texture features of the target object output by the feature extraction branch of the picture reconstruction model.
The updating of the model parameters of the image reconstruction model may refer to updating the model parameters of the feature extraction branch in the image reconstruction model, or may refer to updating the model parameters of the feature extraction branch and the volume rendering branch in the image reconstruction model.
In one possible implementation, steps 220a to 240a are performed iteratively; that is, after step 240a, the computer device may further perform the following procedure through the volume rendering branch in the picture reconstruction model:
generating N new reconstructed pictures based on the undirected distance field corresponding to the new geometric features of the target object and the new texture features of the target object;
determining that the geometric feature optimization of the target object is completed in response to the differences between the N new reconstructed pictures and the N object pictures meeting the convergence condition;
and in response to the differences between the N new reconstructed pictures and the N object pictures not meeting the convergence condition, acquiring a new loss function value based on the differences between the N new reconstructed pictures of the target object and the N object pictures, and updating the model parameters of the picture reconstruction model based on the new loss function value.
The convergence condition may include: the differences between corresponding pictures among the N new reconstructed pictures and the N object pictures all being smaller than a first picture difference threshold. Alternatively, the convergence condition may be set to other conditions, which are not limited in the embodiments of the present application.
In the embodiments of the application, the computer device can iteratively optimize the undirected distance field and the color radiation field of the target object by continuously updating the parameters of the picture reconstruction model. This provides a scheme for optimizing the undirected distance field and color radiation field of the target object with a machine learning model, so that the undirected distance field gradually approximates the actual geometric structure of the target object, ensuring the accuracy of the geometric structure optimization.
In one possible implementation manner, the volume rendering formula used by the volume rendering branch includes a density function; the density function is used to translate the geometric features of the target object into a probability density.
Alternatively, the workflow of the above-mentioned volume rendering branch may be as follows:
The three-dimensional coordinate x is first input into an 8-layer fully connected network (for example, with ReLU as the activation layer and 256 channels per layer), which outputs a probability density σ (also referred to as volume density) and a 256-dimensional feature vector. The feature vector is then concatenated with the viewing direction d (which may include the ray position and angles, denoted θ and φ) and input into another fully connected network (for example, with ReLU as the activation layer and 128 channels per layer), which outputs view-dependent RGB colors; these are then synthesized into a reconstructed picture by the volume rendering technique.
That is, the algorithm provided by the embodiments of the application proposes a novel neural volume rendering equation based on an undirected distance field. When a ray r is emitted from the camera center in direction v, the color corresponding to the ray can be rendered using the following volume rendering formula:
$$C(r)=\int_{t_n}^{t_f} T(t)\,\sigma(r(t))\,c(r(t),v)\,dt,\qquad T(t)=\exp\!\Big(-\int_{t_n}^{t}\sigma(r(s))\,ds\Big)$$
Based on the above volume rendering formula, when the computer device reconstructs a picture, a number of sampling points r(t) are sampled along the ray; each point outputs a probability density σ through UDF-MLP and a color c through Color-MLP, and the probability densities and colors of all the sampling points are integrated to obtain the color value C(r) corresponding to the ray. Here T(t) denotes the transmittance, describing the probability that the ray reaches a given sampling point without being occluded.
In the embodiments of the application, the undirected distance field and the color radiation field are fused through the volume rendering formula to reconstruct N reconstructed pictures corresponding one-to-one to the N object pictures. In this process, the undirected distance field and the color radiation field are correctly fused by converting the undirected distance field into a probability density, ensuring the accuracy of picture reconstruction and thus the accuracy of the subsequent geometric structure optimization of the target object.
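In practice the rendering integral is evaluated by numerical quadrature over the sampling points. The sketch below shows the standard discretization (alpha compositing); the probability densities would come from the UDF-based density function described next, and the colors from Color-MLP:

```python
import torch

def render_ray_color(sigma, color, deltas):
    """Discretized volume rendering along one ray.

    sigma:  (S,)   probability densities at the S sampling points (from UDF-MLP)
    color:  (S, 3) RGB colors at the sampling points (from Color-MLP)
    deltas: (S,)   distances between consecutive sampling points
    """
    alpha = 1.0 - torch.exp(-sigma * deltas)               # opacity of each segment
    trans = torch.cumprod(
        torch.cat([torch.ones(1), 1.0 - alpha + 1e-10])[:-1], dim=0)  # T(t): transmittance
    weights = trans * alpha                                # contribution of each point
    return (weights.unsqueeze(-1) * color).sum(dim=0)      # composited pixel color
```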
In one possible implementation, in response to the geometric feature being characterized based on an undirected distance field, a density function is used to convert undirected distance values in the geometric feature to probability densities via a differentiable indicator function.
The scheme shown in the embodiments of the application designs a novel way to convert the undirected distance value of a sampling point on the ray into a probability density: the distance values of the undirected distance field are mapped to probability values by means of a probability model (the concrete mapping formulas appear as equation figures in the original document). In these formulas, k is an optimizable hyper-parameter describing the degree to which the undirected distance values are converted into probability densities, and a further term denotes the included angle between the gradient direction of the undirected distance field at the sampling point and the direction of the ray.
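Since the patent's exact mapping formulas are only available as figures, the following is a purely illustrative stand-in showing the general shape of such a probability model: a density that peaks where the undirected distance approaches zero and decays with distance, with k controlling the sharpness. This is an assumption for illustration, not the patent's formula:

```python
import torch

def udf_to_density(udf_values, k=60.0):
    """Illustrative UDF-to-density mapping (NOT the patent's exact formula).

    A logistic-density bump centered at udf == 0: density is largest near the
    surface and decays with distance; k controls how sharply it concentrates.
    """
    s = torch.sigmoid(-k * udf_values)   # -> 0.5 at the surface, -> 0 far away
    return k * s * (1.0 - s)             # logistic density, peak k/4 at udf == 0
```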
The embodiments of the application define this conversion function as an indicator function. The volume rendering equation must be differentiable everywhere, and an Undirected Distance Field (UDF) is not differentiable on its zero-level set; to overcome this, the algorithm proposes a differentiable form of the indicator function through a probability model (its defining formulas likewise appear as equation figures in the original document).
In these formulas (symbol names supplied here for readability): t_i denotes the i-th sampling point on ray r and t_j the j-th sampling point; δ_i is the distance between the i-th and (i+1)-th sampling points; α is a hyper-parameter; β_i is an optimizable hyper-parameter describing the extent to which the i-th sampling point is occluded; and θ_{j+1} is the included angle between the gradient direction of the undirected distance field at the (j+1)-th sampling point and the ray direction.
In the embodiment of the application, to overcome the defect that the undirected distance field is not differentiable at its zero level set, volume rendering based on the undirected distance field and the color radiation field is realized through the differentiable indicator function, which ensures the accuracy of the reconstructed pictures and, in turn, the accuracy of the subsequent geometric structure optimization of the target object.
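The following toy sketch shows one differentiable way to concentrate density at the zero level set of an unsigned distance, standing in for the indicator function described above; it omits the gradient-ray angle term of the application and assumes a simple logistic model with the optimizable scale k:

```python
import torch

def udf_to_density(udf_vals, k):
    """Illustrative stand-in for the probability model: sigmoid(k*x) is
    differentiable everywhere, and its derivative k*s*(1-s) peaks exactly at
    x == 0, so density concentrates where the undirected distance vanishes.

    Note: the actual indicator in the application additionally uses the angle
    between the UDF gradient and the ray direction, which is omitted here.
    """
    s = torch.sigmoid(k * udf_vals)
    return k * s * (1.0 - s)  # logistic density, maximal where udf_vals == 0
```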
In one possible implementation, the loss function value includes at least one of a first loss function value and a second loss function value:
the first loss function value is a function value obtained based on the color differences between corresponding pixels of the N reconstructed pictures and the N object pictures;
the second loss function value is a function value obtained based on the color differences between corresponding patches of the N reconstructed pictures and the N object pictures.
For each of the N reconstructed pictures and its corresponding object picture, the computer device calculates the color differences between pixels at corresponding positions, and then calculates the first loss function value based on these per-pixel color differences across the N picture pairs; the first loss function value can reflect the difference between the N reconstructed pictures and the N object pictures at the pixel level.
Similarly, for each corresponding pair among the N reconstructed pictures and the N object pictures, the computer device calculates the color differences between corresponding patches in the reconstructed picture and the object picture (for example, the difference between the average color values of the patches), and then calculates the second loss function value based on these per-patch color differences; the second loss function value can reflect the difference between the N reconstructed pictures and the N object pictures at the patch level.
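A compact sketch of these two terms, assuming L1 distance and non-overlapping patch means (the application does not fix the exact distance metric or patch scheme):

```python
import torch
import torch.nn.functional as F

def pixel_loss(recon, target):
    """First term (sketch): per-pixel color difference between a reconstructed
    picture and its object picture, both shaped (H, W, 3)."""
    return F.l1_loss(recon, target)

def patch_loss(recon, target, patch=8):
    """Second term (sketch): compare the average colors of corresponding
    non-overlapping patch x patch regions of the two pictures."""
    r = F.avg_pool2d(recon.permute(2, 0, 1).unsqueeze(0), patch)
    t = F.avg_pool2d(target.permute(2, 0, 1).unsqueeze(0), patch)
    return F.l1_loss(r, t)
```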
In the embodiment of the application, the computer device can update the parameters of the picture reconstruction model based on the color differences between the reconstructed pictures and the object pictures, so that the updated picture reconstruction model, by optimizing the undirected distance field and the color radiation field, outputs reconstructed pictures closer to the object pictures, thereby ensuring the accuracy of the optimization of the undirected distance field.
In one possible implementation, the loss function value further includes at least one of a third loss function value, a fourth loss function value, and a fifth loss function value in response to the geometric feature being characterized based on the undirected distance field:
the third loss function value is used for optimizing the gradient norm of the undirected distance field;
the fourth loss function value is used to optimize the complexity of the undirected distance field;
the fifth loss function value is used to optimize the contour of the undirected distance field.
In the embodiment of the application, by controlling the gradient norm, the complexity, and the contour of the undirected distance field during model optimization, the computer device can avoid excessive modification of model parameters, which ensures the accuracy of parameter optimization in each iteration, reduces the number of iterations required, and improves iteration efficiency and accuracy.
In one possible implementation, the third loss function value is calculated based on the difference between the gradient norm of the undirected distance field and 1.
In the embodiment of the application, in the process of optimizing the picture reconstruction model, the computer device drives the gradient norm of the undirected distance field as close to 1 as possible, so that the gradient norm is controllable during model optimization, improving iteration efficiency and accuracy.
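A short sketch of such an Eikonal-style term, reusing the hypothetical udf_mlp interface sketched earlier (it returns a distance and a feature for each 3D point):

```python
import torch

def eikonal_loss(points, udf_mlp):
    """Third-term sketch: drive the gradient norm of the undirected distance
    field toward 1 at the sampled 3D points (shape (n, 3))."""
    points = points.detach().requires_grad_(True)
    dist, _ = udf_mlp(points)  # udf_mlp as sketched earlier: (distance, feature)
    grad, = torch.autograd.grad(dist.sum(), points, create_graph=True)
    return ((grad.norm(dim=-1) - 1.0) ** 2).mean()
```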
In one possible implementation, the fourth loss function value is used to exclude sampling points of non-object surfaces in the undirected distance field.
In the embodiment of the application, in the process of optimizing the picture reconstruction model, the computer equipment controls the complexity of the undirected distance field by excluding the sampling points of the non-object surface in the undirected distance field, so that the complexity of the undirected distance field in the model optimization process is controllable, and the iteration efficiency and accuracy are improved.
In one possible implementation, the fifth loss function value is calculated based on a difference between the contour of the undirected distance field and the contour of the target object.
In the embodiment of the application, in the process of optimizing the picture reconstruction model, the computer equipment controls the contour of the undirected distance field to approach the contour of the target object by minimizing the difference between the contour of the undirected distance field and the contour of the target object, so that the controllability of the contour of the undirected distance field in the model optimization process is realized, and the iteration efficiency and accuracy are improved.
In one possible implementation manner, the contour of the target object may be obtained by performing contour annotation on the target object in the object pictures.
In the embodiment of the application, the energy function design and error calculation parts are as follows:
Taking fig. 4 as an example, the algorithm of this scheme optimizes the UDF-MLP and the Color-MLP in an iterative manner. The difference between the synthesized new view (i.e., the reconstructed picture) and the input view (i.e., the object picture) is calculated as the main energy function. The optimization proceeds step by step by minimizing this energy function, so that the undirected distance field (UDF-MLP) and the color radiation field (Color-MLP) conform to the geometric and texture characteristics of the object to be reconstructed. Meanwhile, several regular terms are used as auxiliary terms of the energy function (i.e., the loss function), making the optimization process more stable. The energy function proposed by the algorithm is as follows:
L = L_color + λ_0·L_patch + λ_1·L_eik + λ_2·L_reg + γ·L_mask
The first term L_color calculates, at the pixel level, the error between the color value of a pixel on the synthesized view and the color value of the corresponding pixel of the input view. The second term L_patch calculates, at the patch level, the error between the color values of all pixels of a small region on the synthesized view and the color values of all pixels of the corresponding region on the input view. The third term L_eik is the error between the gradient norm of the undirected distance field and the unit length 1. The fourth term L_reg encourages the undirected distance field to be simple in form, avoiding sub-optimal solutions. Wherein:
L_reg = (1/M) · Σ_k Σ_i exp(−τ · f(r_k(t_i)))
where τ is a constant scalar for scaling the UDF value, M is the number of rays sampled per optimization step, k indexes the k-th ray, and i indexes the i-th sample point on that ray. This term penalizes non-surface sampling points whose UDF value approaches zero, thus encouraging the undirected distance field to have a compact structure.
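A one-line sketch of this regularizer, assuming udf_vals stacks the UDF values of all sample points on the M sampled rays and that a simple mean is used for normalization (the exact normalization in the application may differ):

```python
import torch

def reg_loss(udf_vals, tau):
    """Fourth-term sketch: exp(-tau * f) is large when the undirected distance
    f is near zero, so minimizing its mean discourages spurious zero values
    away from the true surface and keeps the field compact."""
    return torch.exp(-tau * udf_vals).mean()
```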
L_mask calculates the difference between the contour of the undirected distance field and the contour of the input object.
In embodiments of the present application, the computer device may use the Adam optimizer to optimize the UDF-MLP and the Color-MLP by calculating the error between the synthesized picture and the input real view (together with the other auxiliary errors).
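Putting the terms together, a minimal sketch of one optimization step; the λ and γ weight values are invented placeholders, and the optimizer is assumed to be constructed once over the parameters of both networks:

```python
import torch

def total_energy(l_color, l_patch, l_eik, l_reg, l_mask,
                 lam0=0.1, lam1=0.1, lam2=0.01, gamma=0.5):
    # L = L_color + lam0*L_patch + lam1*L_eik + lam2*L_reg + gamma*L_mask.
    # The weight values are placeholders; the application does not disclose them.
    return l_color + lam0 * l_patch + lam1 * l_eik + lam2 * l_reg + gamma * l_mask

def optimization_step(optimizer, loss_terms):
    """One Adam step over the UDF-MLP and Color-MLP parameters, given the five
    loss tensors computed for the current batch of sampled rays."""
    loss = total_energy(*loss_terms)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)
```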
In one possible implementation, before performing step 220a, the computer device may also select a picture reconstruction model matching the N object pictures from a plurality of candidate picture reconstruction models. A candidate picture reconstruction model is a reconstruction model obtained when iterative optimization of the undirected distance field and the color radiation field of a sample object has been completed. That optimization process is similar to the iterative optimization from step 220a to step 240a and is not repeated here.
The three-dimensional reconstruction of other objects has a certain guiding significance for the three-dimensional reconstruction of the target object. Specifically, in the three-dimensional reconstruction of different objects, the parameter optimization of the reconstruction model usually shows a certain commonality, which is reflected in the model update results: when a reconstruction model a has been iteratively optimized to convergence on the undirected distance field and color radiation field of one object A, directly using it to extract the undirected distance field and color radiation field of another object B can already give a relatively accurate result. Therefore, if the reconstruction model a is used as the initialized picture reconstruction model in the three-dimensional reconstruction of object B, and the reconstruction model a matches the target object, the iteration steps of the three-dimensional reconstruction and the number of pictures required can be greatly reduced, considerably improving the efficiency of the three-dimensional reconstruction process.
In one possible implementation manner, the object classification may be whether the object is an open-boundary object. For example, a candidate picture reconstruction model corresponding to open-boundary objects and a candidate picture reconstruction model corresponding to non-open-boundary objects are stored in advance in the computer device. After the computer device acquires the N object pictures, it can identify from them whether the target object is an open-boundary object; if so, the candidate picture reconstruction model corresponding to open-boundary objects is determined as the picture reconstruction model used for the three-dimensional reconstruction of the target object; otherwise, the candidate picture reconstruction model corresponding to non-open-boundary objects is determined as the picture reconstruction model used for the three-dimensional reconstruction of the target object.
In another possible implementation manner, the object classification may be by object type. For example, candidate picture reconstruction models corresponding to object types such as characters, animals, terrain, and buildings are stored in the computer device in advance; after acquiring the N object pictures, the computer device can identify the object type of the target object from them and determine the candidate picture reconstruction model corresponding to the identified object type as the picture reconstruction model used for the three-dimensional reconstruction of the target object.
In this embodiment of the present application, a plurality of candidate picture reconstruction models corresponding to object classifications may be stored in advance in the computer device. When selecting a picture reconstruction model matching the N object pictures, the computer device may determine, from the plurality of candidate models, the one corresponding to the object classification of the target object as the picture reconstruction model used for the three-dimensional reconstruction of the target object. Specifically, if object A and object B belong to the same object classification, using the reconstruction model a to directly extract the undirected distance field and color radiation field of object B is more likely to give a relatively accurate result, so the efficiency of the three-dimensional reconstruction process can be improved.
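A hedged sketch of this model selection step; the classification labels, file paths, and the classify_object stub are all hypothetical:

```python
def classify_object(object_pictures):
    """Placeholder classifier: a real system would run an image classifier over
    the N object pictures (e.g., open-boundary vs. non-open-boundary, or an
    object type such as character / animal / terrain / building)."""
    return "open_boundary"

# Hypothetical pre-stored mapping from object classification to the candidate
# picture reconstruction model used to initialize the reconstruction.
CANDIDATE_MODELS = {
    "open_boundary": "models/open_boundary_init.pt",
    "non_open_boundary": "models/non_open_boundary_init.pt",
}

def select_reconstruction_model(object_pictures):
    return CANDIDATE_MODELS[classify_object(object_pictures)]
```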
Fig. 6 shows a flowchart of a three-dimensional model acquisition method according to an exemplary embodiment of the present application. The method may be performed by a computer device. That is, in the embodiment shown in fig. 2, step 250 may be implemented as step 250a.
Step 250a: in response to the optimization of the geometric features of the target object being completed, an explicit mesh surface of the target object is extracted from the geometric features of the target object.
In the embodiment of the application, after the optimization process converges, the computer device may use a mesh extraction algorithm, such as the mesh-UDF algorithm, to extract an explicit mesh surface (mesh) from the optimized undirected distance field, so as to obtain the model data of the reconstructed three-dimensional model of the target object.
The mesh-UDF algorithm directly meshes the deep UDF into an open surface through an extension of marching cubes, by locally detecting surface intersections.
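As a rough illustration of the hand-off to such a mesher, the sketch below only evaluates the optimized field on a dense grid; the actual surface-crossing detection and triangulation belong to the mesh extraction algorithm and are not reproduced here (udf_fn is an assumed callable):

```python
import numpy as np

def sample_udf_grid(udf_fn, resolution=128, bound=1.0):
    """Evaluate the optimized undirected distance field on a dense voxel grid.
    A marching-cubes-style extension such as mesh-UDF then meshes the zero
    level set from samples like these by locally detecting surface crossings.
    udf_fn is an assumed callable mapping (n, 3) points to (n,) distances."""
    axis = np.linspace(-bound, bound, resolution)
    xs, ys, zs = np.meshgrid(axis, axis, axis, indexing="ij")
    pts = np.stack([xs, ys, zs], axis=-1).reshape(-1, 3)
    return udf_fn(pts).reshape(resolution, resolution, resolution)
```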
This technology can directly model a high-quality three-dimensional model from actually captured multi-view pictures, covering both objects with closed surfaces and objects with open boundaries. The reconstructed objects can be applied to digital content models in games and movies, omitting the tedious manual modeling steps and lowering the threshold for producing three-dimensional models. Meanwhile, low-cost three-dimensional content modeling can save the production cost of digital content in games and movies and improve its production efficiency. Existing modeling methods can only model closed-surface objects and cannot model open-boundary objects, such as character clothing, which greatly limits the scope of intelligent digital content generation.
In order to model objects of any topological form, the solution shown in the above embodiments of the present application uses an undirected distance field to represent the geometry of the object. The sign of a signed (directed) distance field requires dividing space into an interior and an exterior relative to the field, but an open-boundary object has no interior/exterior division, so a signed distance field cannot model an open object. An undirected distance field needs no such division of space, and can therefore model open-boundary objects.
The scheme shown in the embodiment of the application can be used for producing three-dimensional digital content in games and movies. For example, it can be integrated into modeling software in the form of a plug-in; by inputting a series of multi-view pictures, the geometric structure of the photographed object can be automatically reconstructed.
Taking three-dimensional digital content production in a game as an example, please refer to fig. 7, which shows a framework diagram of the three-dimensional model reconstruction according to the present application. As shown in fig. 7, in the process of producing three-dimensional digital content for a game, images of the target object 71 to be modeled are first collected from different angles by a camera, obtaining N object pictures 72, and the N object pictures 72 are input into the computer device.
Then, the computer device performs object classification recognition on the N object pictures 72, obtains an object classification 73, and matches and selects a picture reconstruction model 75 corresponding to the object classification 73 from a plurality of candidate picture reconstruction models 74.
The computer device then performs feature extraction on the N object pictures 72 through feature extraction branches 75a in the picture reconstruction model 75, resulting in an undirected distance field 76 describing the geometry of the target object, and a color radiation field 77 describing the texture of the target object.
The computer device then performs picture generation based on the undirected distance field 76 and the color radiation field 77 through the volume rendering branch 75b in the reconstruction model 75, resulting in N reconstructed pictures 78.
The computer device calculates the loss function value 79 from the differences (optionally in combination with a plurality of auxiliary items) between the N object pictures 72 and the N reconstructed pictures 78, performs iterative optimization of parameters on the reconstructed model 75 by the loss function value 79, and regenerates new N reconstructed pictures 78 by the reconstructed model 75 after each optimization until the iterative optimization is completed.
After the iterative optimization is completed, the computer device obtains the model data (such as mesh data) of the three-dimensional model 710 of the target object 71 through the mesh-UDF algorithm from the undirected distance field 76 after the last iteration, and the model data can subsequently be used to develop object models of virtual objects in the game.
Specifically, please refer to fig. 8, 9 and 10. Wherein, FIG. 8 is a schematic diagram of modeling results of a conventional modeling scheme; FIG. 9 is a schematic representation of modeling results of an open-boundary object based on an undirected distance field in accordance with the present application; FIG. 10 is a schematic representation of modeling results of a closed boundary object (i.e., a non-open boundary object) based on an undirected distance field, to which the present application relates.
As can be seen by comparing fig. 8 to 10, the modeling scheme according to the present application can be applied to the accurate modeling of open-boundary objects and non-open-boundary objects.
Meanwhile, the algorithm shown in the embodiment of the application can further learn prior information from large-scale data to serve as a reconstruction guide, so that the number of input images is reduced, the reconstruction speed is increased, and the modeling efficiency is further improved.
FIG. 11 illustrates a block diagram of a three-dimensional model acquisition device that may be used to perform all or some of the steps of the methods illustrated in FIG. 2, FIG. 4, or FIG. 6, according to an exemplary embodiment of the present application; as shown in fig. 11, the apparatus may include the following modules.
A picture obtaining module 1101, configured to obtain N object pictures; the N object pictures are pictures obtained by carrying out image acquisition on the same target object at different angles; n is an integer greater than or equal to 2;
a feature extraction module 1102, configured to obtain geometric features of the target object and texture features of the target object based on the N object pictures;
a picture reconstructing module 1103, configured to generate N reconstructed pictures of the target object based on geometric features of the target object and texture features of the target object;
An optimization module 1104, configured to iteratively optimize geometric features of the target object and texture features of the target object based on N reconstructed pictures of the target object and differences between the N object pictures;
a model building module 1105, configured to build a three-dimensional model of the target object based on the geometric features of the target object in response to the optimization of the geometric features of the target object being completed.
In a possible implementation manner, the feature extraction module 1102 is configured to input the N object pictures into feature extraction branches in a picture reconstruction model, and obtain geometric features of the target object and texture features of the target object output by the feature extraction branches of the picture reconstruction model;
the image reconstruction module 1103 is configured to generate, through a volume rendering branch in the image reconstruction model, the N reconstructed images based on an undirected distance field corresponding to the geometric feature of the target object and the texture feature of the target object;
the optimization module 1104, for,
acquiring a loss function value based on N reconstructed pictures of the target object and differences among the N object pictures;
Updating model parameters of the picture reconstruction model based on the loss function value;
and inputting the N object pictures into the picture reconstruction model with updated parameters to obtain new geometric characteristics of the target object and new texture characteristics of the target object, wherein the new geometric characteristics are output by a feature extraction branch of the picture reconstruction model.
In a possible implementation manner, the picture reconstruction module 1103 is further configured to generate, through a volume rendering branch in the picture reconstruction model, N new reconstructed pictures based on an undirected distance field corresponding to a new geometric feature of the target object and a new texture feature of the target object;
the apparatus further comprises:
the optimizing module 1104 is further configured to determine that the geometric feature optimization of the target object is completed in response to the differences between the N new reconstructed pictures and the N object pictures satisfying a convergence condition;
the optimizing module 1104 is further configured to obtain a new loss function value based on the N new reconstructed pictures of the target object and the differences between the N object pictures, and update model parameters of the image reconstruction model based on the new loss function value, in response to the N new reconstructed pictures and the differences between the N object pictures not satisfying the convergence condition.
In one possible implementation, the volume rendering formula used by the volume rendering branch includes a density function;
the density function is used to translate the geometric features of the target object into a probability density.
In one possible implementation, the geometric feature is characterized based on an undirected distance field; the density function is used to convert the undirected distance values in the geometric feature into probability densities by a differentiable indicator function.
In one possible implementation, the loss function value includes at least one of a first loss function value and a second loss function value:
the first loss function value is a function value obtained based on a color difference between corresponding pixels between the N reconstructed pictures and the N object pictures;
the second loss function value is a function value obtained based on a color difference between the N reconstructed pictures and the N object pictures, and between corresponding patches.
In one possible implementation, the loss function value further includes at least one of a third loss function value, a fourth loss function value, and a fifth loss function value:
the third loss function value is used for optimizing the gradient norm of the undirected distance field;
The fourth loss function value is used to optimize the complexity of the undirected distance field;
the fifth loss function value is used to optimize the contour of the undirected distance field.
In one possible implementation, the third loss function value is calculated based on a difference between the gradient norm of the undirected distance field and 1.
In one possible implementation, the fourth loss function value is used to exclude sampling points of non-object surfaces in the undirected distance field.
In one possible implementation, the fifth loss function value is calculated based on a difference between the contour of the undirected distance field and the contour of the target object.
In one possible implementation, the model building module 1105 is configured to extract an explicit mesh surface of the target object from the geometric features of the target object in response to optimization of the geometric features of the target object being completed.
In summary, according to the scheme shown in the embodiment of the present application, geometric features and texture features of a target object are obtained based on N object pictures acquired from different angles of the same target object; N reconstructed pictures of the target object are generated based on those geometric and texture features; the geometric features and texture features are then optimized based on the differences between the N reconstructed pictures and the N object pictures, and a three-dimensional model of the target object is constructed from the geometric features once their optimization is completed. In this scheme, after the initial geometric and texture features of the target object are generated from the N object pictures, the N object pictures serve as references for continuous iterative optimization of those features, which improves the accuracy of the geometric features of the target object and thus the modeling effect of the three-dimensional object.
Fig. 12 shows a block diagram of a computer device 1200 shown in an exemplary embodiment of the present application. The computer device may be implemented as a server or a terminal in the above-described schemes of the present application. The computer apparatus 1200 includes a central processing unit (Central Processing Unit, CPU) 1201, a system Memory 1204 including a random access Memory (Random Access Memory, RAM) 1202 and a Read-Only Memory (ROM) 1203, and a system bus 1205 connecting the system Memory 1204 and the central processing unit 1201. The computer device 1200 also includes a mass storage device 1206 for storing an operating system 1209, application programs 1210, and other program modules 1211.
The mass storage device 1206 is connected to the central processing unit 1201 through a mass storage controller (not shown) connected to the system bus 1205. The mass storage device 1206 and its associated computer-readable media provide non-volatile storage for the computer device 1200. That is, the mass storage device 1206 may include a computer readable medium (not shown) such as a hard disk or a compact disk-Only (CD-ROM) drive.
Without loss of generality, the computer readable medium may include computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes RAM, ROM, erasable programmable read-only memory (Erasable Programmable Read Only Memory, EPROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other solid state memory technology, CD-ROM, digital versatile discs (Digital Versatile Disc, DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices. Of course, those skilled in the art will recognize that computer storage media are not limited to those described above. The system memory 1204 and mass storage device 1206 described above may be collectively referred to as memory.
According to various embodiments of the present disclosure, the computer device 1200 may also run by being connected to a remote computer on a network, such as the Internet. That is, the computer device 1200 may be connected to the network 1208 via the network interface unit 1207 coupled to the system bus 1205, or the network interface unit 1207 may be used to connect to other types of networks or remote computer systems (not shown).
The memory further includes at least one computer instruction stored in the memory, and the central processing unit 1201 implements all or part of the steps of the methods shown in the various embodiments described above by executing the at least one computer instruction.
In an exemplary embodiment, a computer readable storage medium is also provided for storing at least one computer instruction that is loaded and executed by a processor to implement all or part of the steps in the three-dimensional model acquisition method described above. For example, the computer readable storage medium may be Read-Only Memory (ROM), random-access Memory (Random Access Memory, RAM), compact disc Read-Only Memory (CD-ROM), magnetic tape, floppy disk, optical data storage device, and the like.
In an exemplary embodiment, a computer program product or a computer program is also provided, the computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs all or part of the steps in the three-dimensional model acquisition method described above.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the present application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof.

Claims (14)

1. A method for obtaining a three-dimensional model, the method comprising:
acquiring N object pictures; the N object pictures are pictures obtained by carrying out image acquisition on the same target object at different angles; n is an integer greater than or equal to 2;
based on the N object pictures, obtaining the geometric characteristics of the target object and the texture characteristics of the target object;
generating N reconstructed pictures of the target object based on the geometric features of the target object and the texture features of the target object;
Iteratively optimizing geometric features of the target object and texture features of the target object based on N reconstructed pictures of the target object and differences among the N object pictures;
and constructing a three-dimensional model of the target object based on the geometric features of the target object in response to the optimization of the geometric features of the target object being completed.
2. The method of claim 1, wherein the generating N reconstructed pictures of the target object based on the geometric features of the target object and the texture features of the target object comprises:
generating N reconstructed pictures of the target object based on the undirected distance field corresponding to the geometric features of the target object and the texture features of the target object.
3. The method of claim 2, wherein the acquiring geometric features of the target object and texture features of the target object based on the N object pictures comprises:
inputting the N object pictures into a feature extraction branch in a picture reconstruction model to obtain geometric features of the target object and texture features of the target object, wherein the geometric features are output by the feature extraction branch of the picture reconstruction model;
The generating N reconstructed pictures of the target object based on the undirected distance field corresponding to the geometric feature of the target object and the texture feature of the target object includes:
generating the N reconstructed pictures based on the undirected distance field corresponding to the geometric features of the target object and the texture features of the target object through the volume rendering branches in the picture reconstruction model;
the iterative optimization of the geometric features of the target object and the texture features of the target object based on the N reconstructed pictures of the target object and the differences among the N object pictures includes:
acquiring a loss function value based on N reconstructed pictures of the target object and differences among the N object pictures;
updating model parameters of the picture reconstruction model based on the loss function value;
and inputting the N object pictures into the picture reconstruction model with updated parameters to obtain new geometric characteristics of the target object and new texture characteristics of the target object, wherein the new geometric characteristics are output by a feature extraction branch of the picture reconstruction model.
4. A method according to claim 3, characterized in that the method further comprises:
Generating N new reconstructed pictures based on undirected distance fields corresponding to the new geometric features of the target object and the new texture features of the target object through a volume rendering branch in the picture reconstruction model;
determining that the geometrical feature optimization of the target object is completed in response to the N new reconstructed pictures and the differences among the N object pictures meeting convergence conditions;
and responding to the difference between the N new reconstructed pictures and the N object pictures not meeting the convergence condition, acquiring new loss function values based on the N new reconstructed pictures of the target object and the difference between the N object pictures, and updating model parameters of the picture reconstruction model based on the new loss function values.
5. A method according to claim 3, wherein the volume rendering branch uses a volume rendering formula comprising a density function;
the density function is used to translate the geometric features of the target object into a probability density.
6. The method according to claim 5, wherein the density function is used to convert undirected distance values in the geometric feature to probability densities by a differentiable indicator function.
7. The method of claim 3, wherein the loss function value comprises at least one of a first loss function value and a second loss function value:
the first loss function value is a function value obtained based on a color difference between corresponding pixels between the N reconstructed pictures and the N object pictures;
the second loss function value is a function value obtained based on a color difference between the N reconstructed pictures and the N object pictures, and between corresponding patches.
8. The method of claim 7, wherein the loss function values further comprise at least one of a third loss function value, a fourth loss function value, and a fifth loss function value:
the third loss function value is used for optimizing the gradient norm of the undirected distance field;
the fourth loss function value is used to optimize the complexity of the undirected distance field;
the fifth loss function value is used to optimize the contour of the undirected distance field.
9. The method of claim 8, wherein the third loss function value is calculated based on a difference between the gradient norm of the undirected distance field and 1.
10. The method of claim 8, wherein the fourth loss function value is used to exclude sample points of non-object surfaces in the undirected distance field.
11. The method of claim 8, wherein the fifth loss function value is calculated based on a difference between the contour of the undirected distance field and the contour of the target object.
12. A three-dimensional model acquisition apparatus, characterized in that the apparatus comprises:
the image acquisition module is used for acquiring N object images; the N object pictures are pictures obtained by carrying out image acquisition on the same target object at different angles; n is an integer greater than or equal to 2;
the feature extraction module is used for acquiring geometric features of the target object and texture features of the target object based on the N object pictures;
the image reconstruction module is used for generating N reconstructed images of the target object based on the geometric characteristics of the target object and the texture characteristics of the target object;
the optimization module is used for carrying out iterative optimization on the geometric characteristics of the target object and the texture characteristics of the target object based on N reconstructed pictures of the target object and differences among the N object pictures;
and the model construction module is used for constructing a three-dimensional model of the target object based on the geometric characteristics of the target object in response to the completion of the geometric characteristic optimization of the target object.
13. A computer device comprising a processor and a memory storing at least one computer instruction that is loaded and executed by the processor to implement the three-dimensional model acquisition method of any one of claims 1 to 11.
14. A computer readable storage medium having stored therein at least one computer instruction that is loaded and executed by a processor to implement the three-dimensional model acquisition method of any one of claims 1 to 11.
CN202310513127.2A 2023-05-09 2023-05-09 Three-dimensional model acquisition method, device, equipment and storage medium Active CN116228994B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310513127.2A CN116228994B (en) 2023-05-09 2023-05-09 Three-dimensional model acquisition method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310513127.2A CN116228994B (en) 2023-05-09 2023-05-09 Three-dimensional model acquisition method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN116228994A true CN116228994A (en) 2023-06-06
CN116228994B CN116228994B (en) 2023-08-01

Family

ID=86589557

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310513127.2A Active CN116228994B (en) 2023-05-09 2023-05-09 Three-dimensional model acquisition method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116228994B (en)


Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100103169A1 (en) * 2008-10-29 2010-04-29 Chunghwa Picture Tubes, Ltd. Method of rebuilding 3d surface model
CN104063894A (en) * 2014-06-13 2014-09-24 中国科学院深圳先进技术研究院 Point cloud three-dimensional model reestablishing method and system
CN105654547A (en) * 2015-12-23 2016-06-08 中国科学院自动化研究所 Three-dimensional reconstruction method
CN105654548A (en) * 2015-12-24 2016-06-08 华中科技大学 Multi-starting-point incremental three-dimensional reconstruction method based on large-scale disordered images
CN106056056A (en) * 2016-05-23 2016-10-26 浙江大学 Long-distance non-contact luggage volume detection system and method thereof
CN109118578A (en) * 2018-08-01 2019-01-01 浙江大学 A kind of multiview three-dimensional reconstruction texture mapping method of stratification
CN109255830A (en) * 2018-08-31 2019-01-22 百度在线网络技术(北京)有限公司 Three-dimensional facial reconstruction method and device
CN109840939A (en) * 2019-01-08 2019-06-04 北京达佳互联信息技术有限公司 Three-dimensional rebuilding method, device, electronic equipment and storage medium
CN110378047A (en) * 2019-07-24 2019-10-25 哈尔滨工业大学 A kind of Longspan Bridge topology ambiguity three-dimensional rebuilding method based on computer vision
CN111127633A (en) * 2019-12-20 2020-05-08 支付宝(杭州)信息技术有限公司 Three-dimensional reconstruction method, apparatus, and computer-readable medium
CN112424835A (en) * 2020-05-18 2021-02-26 上海联影医疗科技股份有限公司 System and method for image reconstruction
CN113240613A (en) * 2021-06-07 2021-08-10 北京航空航天大学 Image restoration method based on edge information reconstruction
CN115115805A (en) * 2022-07-21 2022-09-27 深圳市腾讯计算机系统有限公司 Training method, device and equipment for three-dimensional reconstruction model and storage medium
CN115330940A (en) * 2022-08-09 2022-11-11 北京百度网讯科技有限公司 Three-dimensional reconstruction method, device, equipment and medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
XIAOXIAO LONG: "Adaptive Surface Normal Constraint for Depth Estimation", 《2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV)》, pages 12829 - 12838 *
王涵等: "单张图片自动重建带几何细节的人脸形状", 《计算机辅助设计与图形学学报》, vol. 29, no. 7, pages 1256 - 1266 *
龙霄潇等: "三维视觉前沿进展", 《中国图象图形学报》, vol. 26, no. 6, pages 1389 - 1428 *

Also Published As

Publication number Publication date
CN116228994B (en) 2023-08-01

Similar Documents

Publication Publication Date Title
CN111325851B (en) Image processing method and device, electronic equipment and computer readable storage medium
WO2020228525A1 (en) Place recognition method and apparatus, model training method and apparatus for place recognition, and electronic device
CN109255830A (en) Three-dimensional facial reconstruction method and device
Wang et al. 3d lidar and stereo fusion using stereo matching network with conditional cost volume normalization
CN108734120A (en) Mark method, apparatus, equipment and the computer readable storage medium of image
CN112085840B (en) Semantic segmentation method, semantic segmentation device, semantic segmentation equipment and computer readable storage medium
JP2023545199A (en) Model training method, human body posture detection method, apparatus, device and storage medium
CN110610486A (en) Monocular image depth estimation method and device
CN116310076A (en) Three-dimensional reconstruction method, device, equipment and storage medium based on nerve radiation field
CN111754622B (en) Face three-dimensional image generation method and related equipment
CN115272599A (en) Three-dimensional semantic map construction method oriented to city information model
CN114494395A (en) Depth map generation method, device and equipment based on plane prior and storage medium
CN116051699B (en) Dynamic capture data processing method, device, equipment and storage medium
CN116228994B (en) Three-dimensional model acquisition method, device, equipment and storage medium
CN116977547A (en) Three-dimensional face reconstruction method and device, electronic equipment and storage medium
CN116486038A (en) Three-dimensional construction network training method, three-dimensional model generation method and device
CN116258816A (en) Remote sensing image simulation method based on nerve radiation field
CN115222917A (en) Training method, device and equipment for three-dimensional reconstruction model and storage medium
Cai et al. Automatic generation of Labanotation based on human pose estimation in folk dance videos
CN116433852B (en) Data processing method, device, equipment and storage medium
CN117392297A (en) Three-dimensional model reconstruction method, device, equipment and storage medium
CN117218278A (en) Reconstruction method, device, equipment and storage medium of three-dimensional model
US11562504B1 (en) System, apparatus and method for predicting lens attribute
CN117132744B (en) Virtual scene construction method, device, medium and electronic equipment
CN116524106A (en) Image labeling method, device, equipment, storage medium and program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40088261

Country of ref document: HK