CN118135098A - Method and device for generating three-dimensional shape and computer equipment

Method and device for generating three-dimensional shape and computer equipment

Info

Publication number: CN118135098A
Application number: CN202410129429.4A
Authority: CN (China)
Legal status: Pending
Prior art keywords: dimensional, radiation field, neural, component, image
Other languages: Chinese (zh)
Inventor: 邓文平
Current Assignee: Hunan Shibite Robot Co Ltd
Original Assignee: Hunan Shibite Robot Co Ltd
Application filed by Hunan Shibite Robot Co Ltd
Classifications: Image Generation

Abstract

The application provides a method for generating a three-dimensional shape, a device for generating a three-dimensional shape, and computer equipment, and relates to the technical field of computers. In one embodiment, the method comprises the following steps: acquiring a graph neural network, wherein the nodes of the graph neural network comprise components of a 3D shape and the nodes are connected through edges; the node attributes of a node include a neural radiation field model of the component, and the edge attributes of an edge include the relationship between the two components corresponding to the two nodes connected by the edge; acquiring a two-dimensional image and mapping the two-dimensional image to a three-dimensional image implicit space; and generating, through the graph neural network, the component structure of each component of the three-dimensional shape in the three-dimensional image implicit space. The scheme of this embodiment generates local components of a three-dimensional shape from a two-dimensional image, enabling the 3D shape to be expressed conveniently.

Description

Method and device for generating three-dimensional shape and computer equipment
Technical Field
The present application relates to the field of computer technology, and in particular, to a method of generating a three-dimensional shape, an apparatus for generating a three-dimensional shape, a computer device, a computer readable storage medium, and a computer program product.
Background
A neural radiation field (Neural Radiance Field, NeRF) is an image-based three-dimensional modeling method: a radiation field model based on a neural network, which generates a high-quality 3D (three-dimensional) shape from a plurality of 2D (two-dimensional) images taken from different viewing angles. On this basis it can output the color and density of any given point in three-dimensional space, so that a realistic image can be rendered from any other viewing angle.
However, the final output of an existing neural radiation field is a rendered two-dimensional image; for a three-dimensional object it supports only an implicit representation, and this implicit voxel representation cannot describe the structural information of the three-dimensional object.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a method of generating a three-dimensional shape, an apparatus for generating a three-dimensional shape, a computer device, a computer-readable storage medium, and a computer program product that can express a 3D shape in a convenient manner.
In a first aspect, the present application provides a method of generating a three-dimensional shape, wherein the method comprises:
Acquiring a graph neural network, wherein the nodes of the graph neural network comprise components of a 3D shape, and the nodes are connected through edges; the node attributes of a node include: a neural radiation field model of the component; the edge attributes of an edge include: the relationship between the two components corresponding to the two nodes connected by the edge;
Acquiring a two-dimensional image, and mapping the two-dimensional image to a three-dimensional image implicit space;
and generating, through the graph neural network, the component structure of each component of the three-dimensional shape in the three-dimensional image implicit space.
In a second aspect, the present application also provides an apparatus for generating a three-dimensional shape, wherein the apparatus comprises:
The graph neural network acquisition module is used for acquiring a graph neural network, wherein the nodes of the graph neural network comprise components of a 3D shape, and the nodes are connected through edges; the node attributes of each node include: a neural radiation field model of the component; the edge attributes of an edge include: the relationship between the two components corresponding to the two nodes connected by the edge;
The mapping module is used for acquiring a two-dimensional image and mapping the two-dimensional image to a three-dimensional image implicit space;
And the structure generation module is used for generating, through the graph neural network, the component structure of each component of the three-dimensional shape in the three-dimensional image implicit space.
In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor implementing the steps of the method of any of the embodiments described above when the computer program is executed.
In a fourth aspect, the present application also provides a computer-readable storage medium. The computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the method in any of the embodiments described above.
In a fifth aspect, the present application also provides a computer program product. The computer program product comprising a computer program which, when executed by a processor, implements the steps of the method of any of the embodiments described above.
In the method, the apparatus, the computer device, the computer-readable storage medium, and the computer program product for generating a three-dimensional shape according to the embodiments of the application, the three-dimensional shape is characterized as a combination of the neural radiation field models of its components: the components serve as nodes in the graph neural network, the neural radiation field model of each component serves as the node attribute of the corresponding node, and the relationships among components serve as the edge attributes of the edges. Thus, when a local component of a 3D shape needs to be generated and displayed, the two-dimensional image, once obtained, can be mapped to the three-dimensional image implicit space of the neural radiation field models, and the neural radiation field models of the components, as node attributes, serve as control conditions that control the graph neural network to generate the component structure of the components of the three-dimensional shape. This realizes the generation of local components of a three-dimensional shape based on a two-dimensional image, the convenient expression of the 3D shape, and the generation and editing of local components of the 3D shape.
Drawings
FIG. 1 is a schematic illustration of an application scenario of a method of generating a three-dimensional shape in one embodiment;
FIG. 2 is a flow diagram of a method of generating a three-dimensional shape in one embodiment;
FIG. 3 is a flow chart of a training method of a neural radiation field model in one embodiment;
FIG. 4 is a schematic diagram of a structured neural radiation field in one embodiment;
FIG. 5 is a schematic diagram of the relationship between the neural radiation field and the graph neural network in one embodiment;
FIG. 6 is an exemplary diagram of editing a local part of a 3D shape based on a structured neural radiation field in one example;
FIG. 7 is an exemplary diagram of editing a local part of a 3D shape based on a structured neural radiation field in another example;
FIG. 8 is a block diagram of an apparatus for generating a three-dimensional shape in one embodiment;
FIG. 9 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
Embodiments of the technical scheme of the present application will be described in detail below with reference to the accompanying drawings. The following examples are only for more clearly illustrating the technical aspects of the present application, and thus are merely examples, and are not intended to limit the scope of the present application.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "comprising" and "having" and any variations thereof in the description of the application and the claims and the description of the drawings above are intended to cover a non-exclusive inclusion.
In the description of embodiments of the present application, the technical terms "first," "second," and the like are used merely to distinguish between different objects and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated, a particular order or a primary or secondary relationship. In the description of the embodiments of the present application, the meaning of "plurality" is two or more unless explicitly defined otherwise.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
In the description of the embodiments of the present application, the term "plurality" means two or more (including two), and similarly, "plural sets" means two or more (including two), and "plural sheets" means two or more (including two).
As described above, a neural radiation field supports only an implicit representation of a three-dimensional object, and such an implicit voxel representation cannot characterize the object's structural information; consequently, the structural rationality of geometric modeling cannot be ensured, and structural editing of the three-dimensional object cannot be supported. That is, a conventional neural radiation field represents the entire 3D shape as one global neural radiation field, so the different parts of the shape cannot be edited independently.
It has been found that a structured neural radiation field can be built by combining the neural radiation field with a graph neural network, enabling local editing of 3D shapes. Based on the structured neural radiation field, the 3D shape can be decomposed into a plurality of components, each represented as a locally defined neural radiation field, thereby enabling local editing of the 3D shape: for example, components may be transformed, components of different objects may be mixed, and so on. This mechanism provides more flexibility and controllability for editing and generating 3D shapes, and therefore has broad application prospects in 3D shape generation, shape editing, computer graphics, and related fields. Meanwhile, by combining with the graph neural network, taking the components as nodes, taking the neural radiation field models of the components as node attributes, and taking the relationships among components as edge attributes, the graph neural network can be controlled to generate the component structure of the three-dimensional shape according to the node attributes. This realizes the generation of local components of the three-dimensional shape based on the two-dimensional image, allows the local components of the 3D shape to be generated and edited, and depicts the structural information of the 3D shape.
Based on this, embodiments of the present application provide a method of generating a three-dimensional shape. The method for generating the three-dimensional shape provided by the embodiment of the application can be applied to an application environment shown in fig. 1. The terminal 10 communicates with the server 20 through a network, the server 20 trains and obtains a structured neural radiation field model through training and learning, the structured neural radiation field model comprises a neural radiation field model and a graph neural network of each component in a 3D shape, the components are used as nodes in the graph neural network, the neural radiation field model of the components is used as node attributes of the nodes in the graph neural network, and the relationship among the components is used as edge attributes of edges in the graph neural network. The obtained structured neural radiation field model may be deployed and applied in the terminal 10, the server 20, or other servers. It is also possible that one of the terminals 10 obtains a structured neural radiation field model through learning, and the obtained structured neural radiation field model may be provided to the terminal 10, the server 20, or other server deployment and application. The terminal 10 may be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, internet of things devices, and portable wearable devices, and the internet of things devices may be smart speakers, smart televisions, smart air conditioners, smart vehicle devices, and the like. The portable wearable device may be a smart watch, smart bracelet, headset, or the like. The server 20 may be implemented as a stand-alone server or as a server cluster composed of a plurality of servers.
Referring to fig. 2, a method for generating a three-dimensional shape according to an embodiment of the present application will be described as applied to a server 20 or a terminal 10, where the method includes:
Step S201: acquiring a graph neural network, wherein the nodes of the graph neural network comprise components of a 3D shape, and the nodes are connected through edges; the node attributes of each node include: a neural radiation field model of the component; the edge attributes of an edge include: the relationship between the components corresponding to the nodes connected by the edge.
A graph (Graph) is a data structure that can very naturally model complex relationships among a set of entities in a real scene. Among intelligent approaches to analyzing graph structures, the graph neural network (Graph Neural Network, GNN) is a deep-learning-based way of modeling graphs. A graph is typically represented as G = (V, E), where V denotes the set of nodes and E denotes the set of edges. For two adjacent nodes u, v in the node set V, e = (u, v) denotes the edge between them. The edges between nodes may or may not be oriented: if they are, the graph is called a directed graph (Directed Graph); if they are not, it is called an undirected graph (Undirected Graph).
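The graph structure described above can be sketched in a few lines. The class below is an illustration, not part of the patent; it stores node and edge attribute dictionaries and mirrors the directed/undirected distinction:

```python
# Minimal graph G = (V, E) with per-node and per-edge attribute dictionaries.
class Graph:
    def __init__(self, directed=False):
        self.directed = directed
        self.nodes = {}   # node id -> attribute dict
        self.edges = {}   # (u, v) -> attribute dict

    def add_node(self, u, **attrs):
        self.nodes[u] = attrs

    def add_edge(self, u, v, **attrs):
        self.edges[(u, v)] = attrs
        if not self.directed:          # undirected graph: store both orientations
            self.edges[(v, u)] = attrs

    def neighbors(self, u):
        return [v for (a, v) in self.edges if a == u]

# Hypothetical part names for illustration only.
g = Graph(directed=False)
g.add_node("seat")
g.add_node("left_leg")
g.add_edge("seat", "left_leg", relation="connection")
```

Because the graph here is undirected, adding one edge makes the relationship visible from both endpoints.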
In the embodiment of the application, the 3D shape is divided into a plurality of parts, each part is used as a node in the graph neural network, a nerve radiation field model is built for each part of the 3D shape, the nerve radiation field model of the part is used as the attribute of the node in the graph neural network, and the relationship among the parts is used as the attribute of the edge in the graph neural network. The division manner of dividing the 3D shape into the plurality of parts is not limited, and may relate to the shape characteristics of the 3D shape itself, or may relate to the perspective characteristics based on the implementation of the 3D shape, which is not particularly limited in the embodiment of the present application.
The relationship between the respective components of the 3D shape is not limited, and may be, for example, a symmetrical relationship, a connection relationship, or the like.
While the relationship between components serves as the edge attribute of an edge in the graph neural network, the edge attribute may also include information related to that relationship. When the relationship is a symmetry relationship, the edge attribute may include the symmetry relationship and its symmetry center point or symmetry axis; when the relationship is a connection relationship, the edge attribute may include the connection relationship and its connection direction or connection orientation; but the attributes are not limited thereto.
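As an illustration of these attributes (the part names, model file names, and axis values below are all hypothetical), the node and edge attributes of such a part graph might be laid out as plain data:

```python
# Each node carries a placeholder reference to a trained component NeRF; each
# edge carries its relation type plus relation-specific data such as a
# symmetry axis or a connection direction.
part_graph = {
    "nodes": {
        "left_leg":  {"nerf": "nerf_left_leg.pt"},
        "right_leg": {"nerf": "nerf_right_leg.pt"},
        "seat":      {"nerf": "nerf_seat.pt"},
    },
    "edges": [
        {"u": "left_leg", "v": "right_leg",
         "relation": "symmetry", "axis": (0.0, 0.0, 1.0)},
        {"u": "seat", "v": "left_leg",
         "relation": "connection", "direction": (0.0, -1.0, 0.0)},
    ],
}
```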
Step S202: and acquiring a two-dimensional image, and mapping the two-dimensional image to a three-dimensional image implicit space.
The two-dimensional image is a two-dimensional planar image, which may be a randomly sampled two-dimensional image, such as a randomly sampled two-dimensional image based on implicit information of a graph neural network. In some embodiments, the two-dimensional image may also be an image input by the user, such as an image uploaded by the user terminal, or a two-dimensional image captured by invoking a camera of the user terminal.
A neural radiation field is an image-based three-dimensional modeling approach: it learns a neural network from a series of two-dimensional images with different viewing angles, and the learned network can render a realistic image from any viewing angle. In an embodiment of the application, the 3D shape is divided into a plurality of components and a corresponding neural radiation field model is built for each component, thereby obtaining a set of neural radiation field models. Because this set contains the neural radiation field models of the various components of the 3D shape, it can map the two-dimensional image to the implicit space.
The three-dimensional image implicit space may refer to the implicit representation of a 3D shape in the neural radiation field model, and the mapping of the two-dimensional image to this implicit space may be implemented based on the principle of the neural radiation field model; the embodiment of the present application does not specifically limit it.
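As a loose sketch of what mapping a two-dimensional image to a latent code in an implicit space can look like in the simplest case (the fixed random projection and all dimensions are assumptions, not the patent's method), an image can be flattened and linearly projected:

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, image_shape = 128, (3, 64, 64)
# Fixed random projection standing in for a learned image encoder.
W = rng.normal(size=(latent_dim, np.prod(image_shape)))

def encode(image):
    """Map a 2-D image of shape (C, H, W) to a latent code."""
    return W @ image.reshape(-1)

z = encode(rng.normal(size=image_shape))   # z has shape (latent_dim,)
```

In practice the projection would be a trained network, but the interface, image in, latent code out, is the same.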
Step S203: and implicitly generating a part structure of each part of the three-dimensional shape in space for the three-dimensional image through the graph neural network.
Because the node attribute of each node in the graph neural network is the neural radiation field model of the component corresponding to that node, and a neural radiation field model can learn the characteristics of three-dimensional space from two-dimensional images to obtain an implicit representation of the three-dimensional image, the graph neural network that takes the neural radiation field model of each component as a node attribute can generate the component structure of each component in the three-dimensional image implicit space, and the component structures can be combined into a specific three-dimensional structure.
In the method for generating a three-dimensional shape described above, the three-dimensional shape is characterized as a combination of the neural radiation field models of its components: the components serve as nodes in the graph neural network, the neural radiation field model of each component serves as the node attribute of the corresponding node, and the relationships among components serve as the edge attributes of the edges. Thus, when a local component of a 3D shape needs to be generated and displayed, the two-dimensional image, once obtained, can be mapped to the three-dimensional image implicit space of the neural radiation field models, and the neural radiation field models of the components, as node attributes, serve as control conditions for the graph neural network to generate the component structure of the three-dimensional shape. Local components of the three-dimensional shape are thereby generated from a two-dimensional image and the 3D shape is expressed conveniently; accordingly, local components of the 3D shape can be generated and edited, and the structural information of the 3D shape can be depicted.
In some embodiments, referring to FIG. 3, the manner in which the neural radiation field model of each component is obtained by training may include:
Step S301: a two-dimensional sample image is acquired.
The two-dimensional sample image refers to an image used to train the neural radiation field model, which is a two-dimensional image. The manner of acquiring the two-dimensional sample image is not limited; it may be, for example, a two-dimensional image input by a user. In some embodiments, it may also be a two-dimensional image taken, projected, or rendered from different angles of the 3D shape.
Step S302: each neural radiation field model learns an implicit representation from a two-dimensional image to a three-dimensional image based on the two-dimensional sample image, respectively.
The way in which each neural radiation field model learns the implicit representation from a two-dimensional image to a three-dimensional image from the two-dimensional sample image is not limited; it may be the same as the learning method of an existing neural radiation field model, and the embodiment of the application does not specifically limit it.
Step S303: a two-dimensional rendered image is rendered based on a combination of the implicit representations of the neural radiation field models of the components.
The combination of the implicit representations of the components' neural radiation field models constitutes an implicit representation of the 3D shape formed by those components, from which a two-dimensional rendered image can be rendered.
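The rendering in step S303 rests on the standard NeRF volume-rendering quadrature, which composites per-sample densities and colors along a ray into one pixel color. The following is a generic sketch of that quadrature, not the patent's specific implementation:

```python
import numpy as np

def render_ray(sigmas, colors, deltas):
    """Composite per-sample densities (sigmas), colors, and step sizes
    (deltas) along one ray into a pixel color, per the NeRF quadrature."""
    alphas = 1.0 - np.exp(-sigmas * deltas)                # opacity per sample
    # Transmittance: probability the ray reaches each sample unoccluded.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = trans * alphas
    return (weights[:, None] * colors).sum(axis=0)
```

A fully opaque first sample occludes everything behind it, so the pixel takes the first sample's color.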
Step S304: a penalty between the two-dimensional sample image and the two-dimensional rendered image is calculated.
The difference between the two-dimensional sample image and the two-dimensional rendered image produced by the combination of the components' neural radiation field models indicates whether that combination has accurately learned the implicit representation from two dimensions to three dimensions.
In some embodiments, the calculated penalty may be a penalty obtained by calculating the difference of the two-dimensional rendered image from the input 2D image for each perspective.
The manner of calculating the loss function of the loss is not limited, and the embodiment of the application does not limit the specific type of the loss function.
Step S305: based on the losses, the neural radiation field model of each component is optimized.
Based on the calculated losses, it can be determined whether the combination of the neural radiation field models of the respective components can correctly learn the implicit representation from two dimensions to three dimensions, and thus the neural radiation field models of the respective components can be optimized accordingly. It will be appreciated that the manner in which the neural radiation field model of each component is optimized may be to optimize model parameters in each neural radiation field model, and the specific optimization manner is not limited.
It will be appreciated that after optimizing the neural radiation field model of each component, if training is not complete, the process may return to step S302 to continue learning the implicit representation from the two-dimensional image to the three-dimensional image.
Step S306: at the end of training, the neural radiation field model of each component optimized for the last time is taken as the neural radiation field model of each component obtained by training.
The criterion for ending training is not limited. For example, training may end when the number of training iterations reaches a preset number, when the calculated loss falls within an acceptable range, or when the loss has not improved significantly over a certain number of iterations, for example when the differences between the losses of those iterations fall within a preset range.
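The loop of steps S301 to S306 can be illustrated with a deliberately tiny stand-in (every modeling choice here is an assumption): each component model is reduced to a single learnable value, rendering is the identity, and the loss is the mean squared error against the sample:

```python
import numpy as np

rng = np.random.default_rng(0)
target = np.array([0.2, 0.8])      # stand-in for the 2-D sample image (S301)
params = rng.normal(size=2)        # one toy "component model" per component

for step in range(500):            # S302: learn from the sample
    rendered = params              # S303: render from the model combination
    grad = 2 * (rendered - target) # S304: gradient of the MSE loss
    params -= 0.05 * grad          # S305: optimize each component model
# S306: at the end of training, `params` holds the trained models
```

Real training would backpropagate the rendering loss through each component NeRF; the control flow, render, compare, update, repeat, is the same.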
In some embodiments, each neural radiation field model learns an implicit representation from a two-dimensional image to a three-dimensional image based on the two-dimensional sample image, respectively, and may include:
In each neural radiation field model, while learning the implicit representation from the two-dimensional image to the three-dimensional image based on the two-dimensional sample image, each ray is associated with the neural radiation field model of the first component through which the ray passes.
The neural radiation field determines the position of each sampling point by stepping along the ray direction by the sampling distance. When the 3D shape is divided into a plurality of components, a ray may pass through several of those components; by associating each ray with the neural radiation field model of the first component it passes through, the color of each ray is predicted from the neural radiation field model of one individual component. This avoids color mixing between different components and also ensures that, when one component is edited, neither the shape nor the appearance of the other components changes.
In some embodiments, the ray is associated with a neural radiation field model of a first component through which the ray passes, comprising:
calculating the intersection points of all rays with the components, grouping the rays that intersect the same component according to the proximity of the rays to the intersection points, and associating each group with the neural radiation field model of the corresponding component.
The method of calculating the intersection point of the ray and each component is not limited, and any method may be used that can calculate the intersection point of the straight line and the region.
By grouping the rays that intersect the same component, each ray group can be associated with the corresponding component of the 3D shape, so that the color of each ray is predicted from the neural radiation field representing the relevant component. On this basis, local control and editing of 3D shapes becomes possible without explicit 3D supervision, and by enforcing a one-to-one mapping between rays and the components of the 3D shape, it is ensured that each component based on a neural radiation field can be controlled and edited independently.
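A common way to compute such ray-component intersections, sketched here under the assumption that each component is approximated by an axis-aligned bounding box (the patent does not fix the intersection method), is the slab test, after which each ray is grouped with the component it enters first:

```python
import numpy as np

def ray_aabb_entry(origin, direction, box_min, box_max):
    """Slab test: entry distance of a ray into an axis-aligned box, or None
    on a miss. Assumes no zero direction component, for brevity."""
    t1 = (box_min - origin) / direction
    t2 = (box_max - origin) / direction
    t_near = np.minimum(t1, t2).max()
    t_far = np.maximum(t1, t2).min()
    if t_far < max(t_near, 0.0):
        return None
    return max(t_near, 0.0)

def group_rays(rays, boxes):
    """Assign each ray index to the component whose box it enters first."""
    groups = {name: [] for name in boxes}
    for i, (o, d) in enumerate(rays):
        hits = {n: t for n, (lo, hi) in boxes.items()
                if (t := ray_aabb_entry(o, d, lo, hi)) is not None}
        if hits:
            groups[min(hits, key=hits.get)].append(i)
    return groups
```

Choosing the nearest entry point enforces the one-to-one mapping described above: a ray that crosses several component boxes is still assigned to exactly one component.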
In some embodiments, the method may further comprise:
performing an affine transformation on the neural radiation field model of the corresponding component when a component editing instruction is received, so as to carry out the transformation operation of the component.
An affine transformation, also called an affine mapping, is a linear transformation of one vector space followed by a translation into another vector space. The neural radiation field of each component can therefore be transformed locally using an affine transformation to achieve a transformation of the 3D shape. In a specific example, the structured neural radiation field may associate the neural radiation field of each component with one affine transformation, thereby enabling a transformation operation for each component. Such affine transformations allow the structured neural radiation field to edit 3D shapes flexibly.
Wherein the type of operation of the transformation operation is not limited, and in some embodiments the transformation operation may include at least one of a translation operation, a rotation operation, and a scaling operation.
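One way such an affine edit can be realized (an illustrative sketch, not the patent's implementation) is to attach a transform to a component and evaluate the original field at inverse-transformed sample points, so the field itself never changes:

```python
import numpy as np

def make_affine(scale=1.0, rotation_z=0.0, translation=(0.0, 0.0, 0.0)):
    """Build an affine map x -> A x + t from scaling, a rotation about the
    z-axis, and a translation (the three operations named above)."""
    c, s = np.cos(rotation_z), np.sin(rotation_z)
    R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return scale * R, np.asarray(translation, dtype=float)

def query_transformed(field, A, t, points):
    """Evaluate the transformed component: since x' = A x + t, the original
    field is queried at A^{-1} (x' - t)."""
    local = (np.linalg.inv(A) @ (points - t).T).T
    return field(local)
```

Here `field` stands in for the component NeRF; translating, rotating, or scaling the component only changes `A` and `t`.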
Based on the embodiments of the present application, the concept of a structured neural radiation field is presented, which combines a neural radiation field (NeRF) with a graph neural network. Specifically, the structured neural radiation field, as a neural-network-based generative model built on neural radiation fields (NeRF), can generate structured, editable 3D shapes without explicit 3D supervision. It decomposes the 3D shape into multiple components, represents each component as a locally defined component NeRF, and at the same time generates the spatial relationships between the components.
Referring to fig. 4, in the structured neural radiation field provided according to an embodiment of the present application, each part of the 3D object has a local neural radiation field; the different neural radiation fields are represented here by different filling patterns, and in practical applications they may be represented by different colors. Given a viewing angle, image rendering is performed by sending rays through the neural radiation fields of the different components to obtain color and density values, as shown in fig. 4 (1). Fig. 4 (2) shows the topology generated by the graph neural network, where a solid line represents a connection relationship and a dotted line represents a symmetry relationship; in practical applications, different relationships may be represented by lines of different colors, for example a red edge for a connection relationship and a blue edge for a symmetry relationship.
It can be seen that the structured neural radiation field involves the following technical processing mechanisms:
First: a local control mechanism. Since the structured neural radiation field decomposes the 3D shape into a plurality of components and represents each component as a locally defined component NeRF, each component can be controlled separately via its corresponding component NeRF, thereby implementing local control. This local control mechanism enables the structured neural radiation field to support local editing of the 3D shape, such as transforming components or mixing the components of different objects.
Second: topology generation based on a graph representation. The component structure of the three-dimensional object is generated by a graph neural network, i.e., a graph generation model. The graph generation model produces a graph representing the structure of the 3D object's components: the nodes of the graph are the components of the 3D object, the attributes of the graph's edges represent the relationships between components, and the NeRF defining each component serves as the node attribute of the corresponding node. By using each component's NeRF as the node attribute of its node, the component NeRFs can act as control conditions for the graph generation model, so that the structure generated by the graph neural network stays consistent with the component NeRFs obtained via the local control mechanism.
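The graph described above can be sketched as a small data structure in which nodes carry part NeRFs as attributes and edges carry the inter-part relationship. The dict layout and part identifiers below are illustrative assumptions, with strings standing in for trained NeRF models:

```python
def build_part_graph(part_nerfs, relations):
    """Assemble the structure graph: each node is a part id carrying that
    part's NeRF as a node attribute; each edge records the relationship
    (connection or symmetry) between the two parts it joins."""
    graph = {"nodes": {pid: {"nerf": nerf} for pid, nerf in part_nerfs.items()},
             "edges": []}
    for a, b, relation in relations:
        if relation not in ("connection", "symmetry"):
            raise ValueError(f"unknown relation: {relation}")
        graph["edges"].append({"parts": (a, b), "relation": relation})
    return graph
```

For a chair, for example, the seat and back would be joined by a connection edge while the two legs share a symmetry edge.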
In view of the above, in practice the neural radiation field of each component may be trained first, to obtain the neural radiation field NeRF of each component of the three-dimensional shape.
In some embodiments, the neural radiation field model of each component may be trained as follows:
acquiring a two-dimensional sample image;
each neural radiation field model respectively learning an implicit representation from a two-dimensional image to a three-dimensional image based on the two-dimensional sample image;
rendering a two-dimensional rendered image based on a combination of the implicit representations of the neural radiation field models of the components;
calculating a loss between the two-dimensional sample image and the two-dimensional rendered image;
optimizing the neural radiation field model of each component based on the loss;
at the end of training, taking the neural radiation field model of each component as last optimized as the trained neural radiation field model of each component.
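The loop structure of the steps above can be illustrated with deliberately simplified stand-ins: here each "part model" is just a per-pixel parameter map gated by a fixed binary part mask, so the gradient of the 2D loss is analytic. This is an assumption-laden toy of the render/loss/optimize cycle, not the actual NeRF optimization:

```python
import numpy as np

def train_part_models(sample_image, masks, lr=0.5, steps=300):
    """Toy version of the training loop: rendering sums the per-part
    contributions, the loss is the MSE against the sample image, and every
    part's parameters are updated with the analytic gradient of that loss."""
    params = [np.zeros_like(sample_image) for _ in masks]
    loss = None
    for _ in range(steps):
        rendered = sum(m * p for m, p in zip(masks, params))   # combine parts
        residual = rendered - sample_image
        loss = float(np.mean(residual ** 2))                   # 2D image loss
        for i, m in enumerate(masks):                          # optimize each part
            params[i] -= lr * (2.0 / residual.size) * m * residual
    return params, loss
```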
In the training process described above, an unsupervised learning mechanism may be employed to learn a representation of the 3D shape. The unsupervised learning mechanism uses an autoencoder to learn the representation of the 3D shape from images captured by known cameras together with object masks. The autoencoder is a neural network consisting of an encoder and a decoder. The encoder maps the input (i.e., the set of embedding vectors representing the features of each training sample) to a low-dimensional latent space, and the decoder maps the latent space back to the original input space, i.e., the 3D shape.
In the training process of embodiments of the present application, the autoencoder may be optimized to minimize the reconstruction error between input and output, so that the autoencoder learns, in the latent space, the basic features encoding 3D shapes; these features may be used to generate new shapes or to edit existing ones. Because the neural radiation field, and the structured neural radiation field built on it, need no explicit 3D supervision and instead learn the representation of the 3D shape from readily available 2D images and object masks, adopting an unsupervised learning mechanism makes the method more scalable and practical for real applications. Moreover, training the structured neural radiation field with an autoencoder also allows a compact and meaningful representation of the 3D shape to be learned, which is very useful for various downstream tasks (e.g., shape retrieval, classification, and segmentation).
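The encoder/decoder structure and the reconstruction-error objective can be illustrated with a minimal linear autoencoder trained by gradient descent; the dimensions, learning rate, and linear form below are illustrative assumptions, not the embodiment's actual network:

```python
import numpy as np

rng = np.random.default_rng(0)

class LinearAutoencoder:
    """Minimal encoder/decoder pair: the encoder projects inputs into a
    low-dimensional latent space, the decoder maps latents back, and
    training descends the mean squared reconstruction error."""
    def __init__(self, dim_in, dim_latent):
        self.W_enc = rng.normal(scale=0.1, size=(dim_in, dim_latent))
        self.W_dec = rng.normal(scale=0.1, size=(dim_latent, dim_in))

    def encode(self, x):
        return x @ self.W_enc

    def decode(self, z):
        return z @ self.W_dec

    def train_step(self, x, lr=0.01):
        z = self.encode(x)
        err = self.decode(z) - x                 # reconstruction residual
        # gradient descent on the reconstruction error (constants folded into lr)
        self.W_dec -= lr * z.T @ err / len(x)
        self.W_enc -= lr * x.T @ (err @ self.W_dec.T) / len(x)
        return float(np.mean(err ** 2))
```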
After the neural radiation field of each component is obtained through training, each component is used as a node of the graph neural network, the relationship between components is used as the attribute of the corresponding edge, and the neural radiation field model of each component is used as the node attribute of that component's node, yielding the graph neural network model; its principle is shown in fig. 5.
The resulting graph neural network model may be deployed into a device that needs to generate, view, and edit the local parts of the 3D shape, such as a terminal or a server. When deployed to a server, a terminal may view and edit the local parts of the 3D shape by accessing the server.
After deployment of the graph neural network is complete, the local parts of the 3D shape can be viewed and edited through the graph neural network based on an input two-dimensional image: after the two-dimensional image is acquired, it is mapped to the implicit space, and the component structures of the parts of the three-dimensional shape are generated in the implicit space through the graph neural network.
In the process of viewing and editing the component structures of the components, the components can be edited by setting or changing the relationships among the nodes in the explicit graph neural network. In some examples, viewing and editing of partial components of 3D shapes is illustrated in figs. 6 and 7. In fig. 6, the three 3D shapes on the left are represented based on component neural radiation fields, and the 3D shape on the right, likewise represented by component neural radiation fields, is obtained through addition and subtraction operations. In fig. 7, cross editing is performed on the component neural radiation field model: by cross-interchanging the neural radiation fields of the parts of two 3D shapes, a 3D shape editing result similar to DNA crossover can be achieved.
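The cross-interchange of fig. 7 amounts to exchanging the NeRF node attributes of the same part between the graphs of two shapes. A minimal sketch, assuming a hypothetical dict layout in which each node stores its part NeRF under the key `"nerf"`:

```python
def swap_parts(graph_a, graph_b, part_id):
    """Cross-edit two shapes by exchanging the part NeRF stored on the same
    node of both shape graphs (cf. the DNA-crossover-like edit of fig. 7)."""
    graph_a["nodes"][part_id]["nerf"], graph_b["nodes"][part_id]["nerf"] = (
        graph_b["nodes"][part_id]["nerf"],
        graph_a["nodes"][part_id]["nerf"],
    )
```

Because each part has its own locally defined NeRF, the swap leaves every other part of both shapes untouched.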
In the process of generating the component structure of the parts of the three-dimensional shape in the implicit space, the graph neural network assigns each ray to the NeRF of a single local part, ensuring that the NeRF owning each ray is unique; in embodiments of the present application this may be referred to as a hard assignment mechanism. Since in embodiments of the present application the object is represented by a set of locally defined neural radiation fields, the local radiation fields are arranged together so that the object can be plausibly rendered from a new viewpoint. Under the hard assignment mechanism, each ray is associated with the first part it passes through, so the color of each ray is predicted from a single NeRF. This prevents color mixing between different parts and ensures that, when one part is edited, the shape and appearance of the other parts do not change.
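The hard assignment rule — each ray belongs to the first part it hits — can be sketched as follows; as a simplifying assumption, each part is approximated here by a bounding sphere rather than by its NeRF geometry:

```python
import numpy as np

def assign_rays_hard(ray_origins, ray_dirs, part_spheres):
    """Hard assignment: each ray is owned by the first part it hits; parts
    are approximated by bounding spheres (center, radius). Returns the
    owning part index per ray, or -1 for a miss."""
    owners = np.full(len(ray_origins), -1)
    for r, (o, d) in enumerate(zip(ray_origins, ray_dirs)):
        best_t = np.inf
        for i, (center, radius) in enumerate(part_spheres):
            oc = o - center
            b = np.dot(oc, d)                      # d assumed unit length
            disc = b * b - (np.dot(oc, oc) - radius ** 2)
            if disc < 0.0:
                continue                           # ray misses this part
            t = -b - np.sqrt(disc)                 # nearest hit along the ray
            if 0.0 <= t < best_t:
                best_t = t
                owners[r] = i
    return owners
```

Each ray's color would then be predicted only by the NeRF of its owning part, which is what prevents color mixing between parts.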
In the process of generating the component structure of the components of the three-dimensional shape in the implicit space, when a component editing instruction is received, the graph neural network can perform an affine transformation on the neural radiation field model of the component corresponding to the instruction, so as to carry out the transformation operation of that component and thereby edit the 3D shape flexibly.
It should be understood that, although the steps in the flowcharts of the embodiments described above are shown sequentially as indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the order of execution of the steps is not strictly limited, and the steps may be performed in other orders. Moreover, at least part of the steps in the flowcharts of the above embodiments may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times; the order of these sub-steps or stages is not necessarily sequential, and they may be performed in turn or alternately with at least part of the sub-steps or stages of other steps.
Based on the same inventive concept, an embodiment of the present application also provides an apparatus for generating a three-dimensional shape, for implementing the above-mentioned method of generating a three-dimensional shape. The implementation of the solution provided by the apparatus is similar to that described in the method above, so for the specific limitations in the one or more embodiments of the apparatus for generating a three-dimensional shape provided below, reference may be made to the limitations on the method for generating a three-dimensional shape above, which are not repeated here.
In one embodiment, as shown in FIG. 8, there is provided an apparatus for generating a three-dimensional shape, comprising: a graph neural network acquisition module 801, a mapping module 802, and a structure generation module 803, wherein:
The graph neural network acquisition module is used for acquiring a graph neural network, wherein the nodes of the graph neural network comprise components of a 3D shape, and the nodes are connected through edges; the node attribute of each node includes: a neural radiation field model of the component; the edge attributes of an edge comprise: the relationship between the two components corresponding to the two nodes connected by the edge;
The mapping module is used for acquiring a two-dimensional image and mapping the two-dimensional image to a three-dimensional image implicit space;
And the structure generation module is used for generating the component structure of each component of the three-dimensional shape in the implicit space of the three-dimensional image through the graph neural network.
In some embodiments, the apparatus further comprises a training module for: acquiring a two-dimensional sample image; each neural radiation field model respectively learning an implicit representation from a two-dimensional image to a three-dimensional image based on the two-dimensional sample image; rendering a two-dimensional rendered image based on a combination of the implicit representations of the neural radiation field models of the components; calculating a loss between the two-dimensional sample image and the two-dimensional rendered image; optimizing the neural radiation field model of each component based on the loss; and, at the end of training, taking the neural radiation field model of each component as last optimized as the trained neural radiation field model of each component.
In some embodiments, the training module is configured to associate, in each of the neural radiation field models, a ray with the neural radiation field model of the first component through which the ray passes in learning an implicit representation from a two-dimensional image to a three-dimensional image based on the two-dimensional sample image, respectively.
In some embodiments, the training module is configured to calculate intersections of all rays with each component, group rays that intersect the same component according to their proximity to the intersection, and associate the group with a neural radiation field model of the corresponding component.
In some embodiments, the relationship between the components comprises: connection relationships and symmetry relationships.
In some embodiments, the structure generating module is configured to perform affine transformation on the neural radiation field model of the corresponding component to perform transformation operation of the component when receiving the component editing instruction.
In some embodiments, the transformation operation includes at least one of a translation operation, a rotation operation, and a scaling operation.
The respective modules in the above-described apparatus for generating a three-dimensional shape may be realized in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in hardware in, or independent of, a processor in the computer device, or may be stored as software in a memory in the computer device, so that the processor may call and execute the operations corresponding to the above modules.
In one embodiment, a computer device is provided, which may be a server, and the internal structure of which may be as shown in fig. 9. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is for storing data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a method of generating a three-dimensional shape.
It will be appreciated by persons skilled in the art that the architecture shown in fig. 9 is merely a block diagram of the component architecture associated with the present inventive arrangements and is not limiting as to the computer device to which the present inventive arrangements are applied, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
In an embodiment, a computer device is provided, comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, implements the steps of the method of generating a three-dimensional shape in any of the embodiments described above.
In one embodiment, a computer readable storage medium is provided having a computer program stored thereon which, when executed by a processor, implements the steps of the method of generating a three-dimensional shape in any of the embodiments described above.
In an embodiment, a computer program product is provided comprising a computer program which, when executed by a processor, implements the steps of the method of generating a three-dimensional shape in any of the embodiments described above.
The user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present application are information and data authorized by the user or sufficiently authorized by each party.
Those skilled in the art will appreciate that the processes implementing all or part of the methods of the above embodiments may be implemented by means of a computer program stored on a non-volatile computer readable storage medium, which when executed may include the processes of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, Resistive Random Access Memory (ReRAM), Magnetoresistive Random Access Memory (MRAM), Ferroelectric Random Access Memory (FRAM), Phase Change Memory (PCM), graphene memory, and the like. Volatile memory can include Random Access Memory (RAM) or external cache memory, and the like. By way of illustration, and not limitation, RAM can take various forms such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases referred to in the embodiments provided herein may include at least one of a relational database and a non-relational database. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processor referred to in the embodiments provided in the present application may be a general-purpose processor, a central processing unit, a graphics processor, a digital signal processor, a programmable logic unit, a data processing logic unit based on quantum computing, or the like, but is not limited thereto.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction between the combinations of these technical features, they should be considered within the scope of this description.
The foregoing examples illustrate only a few embodiments of the application, and their description is specific and detailed, but they should not thereby be construed as limiting the scope of the application. It should be noted that several variations and modifications can be made by those skilled in the art without departing from the spirit of the application, all of which fall within the protection scope of the application. Accordingly, the scope of protection of the application should be determined by the appended claims.

Claims (10)

1. A method of generating a three-dimensional shape, the method comprising:
Acquiring a graph neural network, wherein nodes of the graph neural network comprise components of a 3D shape, and the nodes are connected through edges; the node attributes of a node include: a neural radiation field model of the component; the edge attributes of an edge comprise: the relationship between the two components corresponding to the two nodes connected by the edge;
Acquiring a two-dimensional image, and mapping the two-dimensional image to a three-dimensional image implicit space;
and generating, in the three-dimensional image implicit space, the component structure of each component of the three-dimensional shape through the graph neural network.
2. The method of claim 1, wherein the training of the neural radiation field model comprises:
acquiring a two-dimensional sample image;
each neural radiation field model learns an implicit representation from a two-dimensional image to a three-dimensional image based on the two-dimensional sample image, respectively;
Rendering a two-dimensional rendered image based on a combination of the implicit representations of the neural radiation field models of the components;
Calculating a loss between the two-dimensional sample image and the two-dimensional rendered image;
optimizing the neural radiation field model for each component based on the losses;
at the end of training, taking the neural radiation field model of each component as last optimized as the trained neural radiation field model of each component.
3. The method of claim 2, wherein each neural radiation field model learns an implicit representation from a two-dimensional image to a three-dimensional image based on the two-dimensional sample image, respectively, comprising:
In each of the neural radiation field models, a ray is associated with the neural radiation field model of the first component through which the ray passes in learning an implicit representation from a two-dimensional image to a three-dimensional image based on the two-dimensional sample image, respectively.
4. A method according to claim 3, wherein the ray is associated with a neural radiation field model of a first component through which the ray passes, comprising:
calculating the intersection points of all rays with the components, grouping the rays intersecting the same component according to their proximity to the intersection points, and associating each group with the neural radiation field model of the corresponding component.
5. The method of any one of claims 1 to 4, wherein the relationship between the components comprises: connection relationships and symmetry relationships.
6. The method according to any one of claims 1 to 4, further comprising:
And carrying out affine transformation on the nerve radiation field model of the corresponding part when a part editing instruction is received so as to carry out transformation operation of the part.
7. The method of claim 6, wherein the transformation operation comprises at least one of a translation operation, a rotation operation, and a scaling operation.
8. An apparatus for generating a three-dimensional shape, the apparatus comprising:
The graph neural network acquisition module is used for acquiring a graph neural network, wherein the nodes of the graph neural network comprise components of a 3D shape, and the nodes are connected through edges; the node attribute of each node includes: a neural radiation field model of the component; the edge attributes of an edge comprise: the relationship between the two components corresponding to the two nodes connected by the edge;
The mapping module is used for acquiring a two-dimensional image and mapping the two-dimensional image to a three-dimensional image implicit space;
And the structure generation module is used for generating the component structure of each component of the three-dimensional shape in the implicit space of the three-dimensional image through the graph neural network.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 7 when the computer program is executed.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 7.
CN202410129429.4A 2024-01-31 2024-01-31 Method and device for generating three-dimensional shape and computer equipment Pending CN118135098A (en)

Publications (1)

Publication Number Publication Date
CN118135098A true CN118135098A (en) 2024-06-04
