CN115511050B - Deep learning model with simplified three-dimensional model grid and training method thereof - Google Patents

Deep learning model with simplified three-dimensional model grid and training method thereof

Info

Publication number
CN115511050B
CN115511050B CN202211170843.7A
Authority
CN
China
Prior art keywords
layer
model
dimensional
dimensional model
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211170843.7A
Other languages
Chinese (zh)
Other versions
CN115511050A (en)
Inventor
杜创
杨会兵
魏亚兴
潘竹
何堃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei Comprehensive Pipe Gallery Investment Operation Co ltd
Hefei Ruisheng Smart Technology Co ltd
Original Assignee
Hefei Comprehensive Pipe Gallery Investment Operation Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei Comprehensive Pipe Gallery Investment Operation Co ltd filed Critical Hefei Comprehensive Pipe Gallery Investment Operation Co ltd
Priority to CN202211170843.7A priority Critical patent/CN115511050B/en
Publication of CN115511050A publication Critical patent/CN115511050A/en
Application granted granted Critical
Publication of CN115511050B publication Critical patent/CN115511050B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T17/20Finite element generation, e.g. wire-frame surface description, tesselation

Abstract

The invention belongs to the fields of computer graphics, deep machine learning, and industrial-automation virtual simulation, and provides a deep learning model for three-dimensional model mesh simplification comprising a recurrent neural network layer, fully connected layers, 3D convolution layers, an encoder layer and a decoder layer, and an attention layer. A training method for three-dimensional model mesh simplification is also provided. The rasterized structural similarity function of the invention preserves the similarity of the rendered images while greatly reducing the total number of triangular faces of the model; its simplification effect improves as training samples and training rounds increase, and it offers good simplification performance, execution efficiency, and robustness.

Description

Deep learning model with simplified three-dimensional model grid and training method thereof
Technical Field
The invention belongs to the fields of computer graphics, deep machine learning, and industrial-automation virtual simulation, and in particular relates to a deep learning model for three-dimensional model mesh simplification and a training method thereof.
Background
With the needs of social development and the progress of science and technology, three-dimensional data has become a common and important data type, spreading from its original domains of engineering applications and high-performance games into all aspects of production and daily life. At the same time, the rapid development of three-dimensional data acquisition and modeling technology has quickly increased the fineness of three-dimensional models. Because three-dimensional data can be regarded as image data extended by one dimension, the growth in storage and computation caused by this increase in precision is even more severe.
Three-dimensional models are mainly represented as voxels, point clouds, and triangular meshes. Of these, the triangular mesh is the most widely used: compared with voxel and point-cloud data it has the highest information entropy, its vector data format is not limited by resolution, and the mainstream graphics processors (GPUs) of today are also optimized for triangular meshes.
To reduce the resources consumed at each stage of storing, transmitting, loading, and rendering three-dimensional data, simplification and multi-resolution modeling of the three-dimensional model is a natural avenue of improvement.
Many three-dimensional model simplification methods already exist, such as traditional pruning algorithms based on vertex deletion and vertex clustering, iterative algorithms based on edge contraction, and filtering algorithms based on the Fourier or wavelet transform. As deep machine learning has matured, learning-based methods have also appeared; for example, the Tsinghua University Jittor team proposed SubdivNet, a convolutional neural network operating on triangular mesh patches that migrates image network architectures to three-dimensional geometry learning in order to simplify three-dimensional models.
However, these methods have a series of problems, such as broken mesh surfaces, severe loss of contour features, and large amounts of detail retained in closed, invisible regions. For SubdivNet, which is based on remeshing of triangular faces or on three-dimensional convolution after voxelization, when the geometric elements of the processed model differ greatly in size, the remeshing step can over-grow triangular faces and sharp parts of the model can be severely truncated.
Moreover, when these methods simplify a model they mainly consider spatial-geometry factors and pay insufficient attention to the images rendered from the three-dimensional model. As a result, some models remain highly similar in spatial geometry yet differ considerably in the final rendering; most notably, when the LoD (level-of-detail) technique is applied for real-time rendering, switching the three-dimensional model between different detail levels causes clearly perceptible changes.
There is therefore a need for a three-dimensional model simplification method that overcomes the shortcomings of the prior-art methods described above.
Disclosure of Invention
The purpose of the invention is to construct a technique for mesh simplification of three-dimensional models by applying deep learning, obtaining the corresponding objective function by designing a deep neural network model and training and tuning it. The model can simplify the mesh according to an input source three-dimensional model M and a vertex retention coefficient, thereby effectively reducing the consumption of three-dimensional data during storage, transmission, loading, rendering, and similar processes. The invention relates to a deep learning model for three-dimensional model mesh simplification and a training method thereof; the deep learning model comprises a recurrent neural network layer, fully connected layers, 3D convolution layers, an encoder layer and a decoder layer, and an attention layer. The recurrent neural network layer serves as the input layer and receives variable-length three-dimensional model mesh-face data. Two or more fully connected layers follow the recurrent neural network layer and extract the features and connectivity of the mesh-face data passed from the previous layer. The data are then reshaped after the fully connected layers, applying a three-dimensional affine transformation to the data output by the preceding fully connected layer. 3D convolution layers follow in sequence, downsampling the input and extracting its external features. The 3D convolution layers are followed by an encoder layer and a decoder layer that implicitly encode and decode the input three-dimensional model. The encoder and decoder layers are followed by two or more 3D convolution layers, which successively upsample the input and construct and refine contour features. These 3D convolution layers are followed by two or more fully connected layers that perform translation precoding of the simplified model. The attention layer is placed after the fully connected layers; it cross-attends between the layers before and after encoding and decoding, and controls the vertex output of the three-dimensional model according to trained weights so as to simplify the model. The attention layer is followed by a recurrent neural network layer, which serves as the output layer and outputs the simplified model as an L × 3 × 3 floating-point array.
In a second aspect, a training method for three-dimensional model mesh simplification is provided. It includes an initial training stage in which the three-dimensional Jaccard similarity coefficient (3D IoU) is used as the loss function to accelerate convergence of the model; after the model converges, training switches to the rasterized structural similarity function as the loss function; and during training with the rasterized structural similarity function, the model is further optimized by randomizing the illumination and viewing angle of the rasterization environment.
Further, the code of the rasterized structural similarity function is:
def ssim_rasterize(y_true, y_pred):
    return ssim(rasterize(y_true), rasterize(y_pred))
Further, the rasterized structural similarity function rasterizes the true value y and the predicted value y' of the multi-dimensional array representation of the three-dimensional model separately and then computes the structural similarity of the two rendered results.
Further, the rasterized structural similarity function is computed as follows: randomly generate each element of the rasterization shader, namely the shader type, the camera type and position, and the light position and intensity; rasterize the input three-dimensional model Y and the output three-dimensional model Y' under that shader; and substitute the planar images I and I' produced by rasterized rendering into the structural similarity function to obtain the loss value. The same training step may rasterize the three-dimensional model Y and the output three-dimensional model Y' two or more times and average the loss values to accelerate the training process.
Further, before the initial training stage begins, the data sample files are preprocessed: each file is converted into an L × 3 × 3 floating-point array, and each individual model is normalized by placing its center at the origin of the three-dimensional coordinate system and scaling it to a data space with zero mean and unit variance.
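As an illustration of this preprocessing step, the following minimal numpy sketch centers a single model at the origin and scales it to zero mean and unit variance; the function name and the assumption that the input is already a list of L triangles with three 3D vertices each are illustrative and not part of the disclosed implementation:
import numpy as np

def preprocess_mesh(faces):
    # Hypothetical preprocessing sketch; assumes `faces` is array-like with
    # shape (L, 3, 3): L triangular faces x 3 vertices x 3 coordinates.
    data = np.asarray(faces, dtype=np.float32).reshape(-1, 3, 3)
    verts = data.reshape(-1, 3)
    data = data - verts.mean(axis=0)      # place the model's center at the coordinate origin (zero mean)
    data = data / (verts.std() + 1e-8)    # scale to unit variance
    return data                           # normalized L x 3 x 3 floating-point array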
The beneficial effects are that:
The method is analogous to an NLP (natural language processing) model extracting a summary from a long text. At the same time, the rasterized structural similarity function of the method is derived directly from the goal of three-dimensional model simplification, so the similarity of the rendered images can be preserved while the total number of triangular faces of the model is greatly reduced, and the simplification effect improves as training samples and training rounds increase. Compared with prior methods, the method has good simplification performance and execution efficiency, and higher robustness.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:
FIG. 1 is a logic diagram of a three-dimensional model grid simplified deep learning model;
Detailed Description
The preferred embodiments of the present invention will be described below with reference to the accompanying drawings, it being understood that the preferred embodiments described herein are for illustration and explanation of the present invention only, and are not intended to limit the present invention.
Because of the way human eyes perceive three-dimensional images and the limitations of existing display devices, three-dimensional data are still presented mainly as planar images. It is therefore a feasible goal to reduce the number of triangular faces of a three-dimensional model as much as possible while preserving the detail of the two-dimensional images generated by rendering. In other words, the closer the planar images I and I' produced by rasterized rendering of the three-dimensional model X and the simplified model X' are to each other, the better the simplification can be considered; at the same time, the lower the ratio of the total number of triangular faces N' of the simplified model to the total number N of the original model, the more effective the simplification. The invention uses an image-driven three-dimensional model simplification method and fits the simplification function f by deep learning, which solves the problem that the many factors involved cannot be hard-coded.
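The sketch below only illustrates this criterion; rasterize() and ssim() stand in for an arbitrary rasterizing renderer and a structural-similarity implementation (for example scikit-image's structural_similarity), and mesh objects carrying a faces attribute are assumptions rather than part of the disclosed method:
def evaluate_simplification(model_x, model_x_prime):
    # Render both meshes to planar images I and I' under the same shader,
    # then score the simplification by image similarity and face reduction.
    image_i = rasterize(model_x)                  # planar image I of the source model X
    image_i_prime = rasterize(model_x_prime)      # planar image I' of the simplified model X'
    similarity = ssim(image_i, image_i_prime)     # closer to 1 means less visible change
    ratio = len(model_x_prime.faces) / len(model_x.faces)   # N'/N, lower means stronger simplification
    return similarity, ratio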
The technical scheme of the invention mainly comprises the following steps. The logic of the deep learning model for three-dimensional model mesh simplification is shown in FIG. 1:
1. The input layer is a recurrent neural network layer that receives variable-length three-dimensional model mesh-face data;
2. Two or more fully connected layers then extract the features and connectivity of the mesh-face data passed from the previous layer;
3. The data are then reshaped, applying a three-dimensional affine transformation to the data output by the preceding fully connected layer;
4. 3D convolution layers follow in sequence, downsampling the input and extracting its external features;
5. An encoder layer and a decoder layer follow, implicitly encoding and decoding the input three-dimensional model;
6. The encoder and decoder layers are followed by two or more three-dimensional (3D) convolution layers, which successively upsample the input and construct and refine contour features;
7. Two or more fully connected layers follow, performing translation precoding of the simplified model;
8. The attention layer then applies a cross-attention mechanism: it takes the fully connected layer after the input layer and the bottleneck fully connected layer as inputs, obtains the cross weights of the hidden information between the two layers, and filters the weights at the output with a learned threshold, thereby extracting the key information and achieving the simplification of the three-dimensional model.
9. The output layer is a recurrent neural network layer, and the simplified model is output as an L × 3 × 3 floating-point array.
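The following PyTorch sketch mirrors steps 1-9 above. It is illustrative only: the choice of GRU and nn.MultiheadAttention layers, the 8x8x8 voxel grid, and all layer sizes are assumptions made for the sake of a runnable example, not the patented implementation.
import torch
import torch.nn as nn

class MeshSimplifyNet(nn.Module):
    """Illustrative sketch of the architecture in steps 1-9; all sizes are assumptions."""

    def __init__(self, hidden=128, voxel=8, channels=4):
        super().__init__()
        self.voxel, self.channels = voxel, channels
        # 1. recurrent input layer: variable-length sequence of faces (9 floats each)
        self.rnn_in = nn.GRU(input_size=9, hidden_size=hidden, batch_first=True)
        # 2. fully connected layers: extract features and connectivity
        self.fc_in = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                   nn.Linear(hidden, hidden), nn.ReLU())
        # 3. reshape / affine step: project features onto a small voxel grid
        self.to_voxel = nn.Linear(hidden, channels * voxel ** 3)
        # 4. 3D convolutions: downsample and extract external features
        self.down = nn.Sequential(nn.Conv3d(channels, 8, 3, stride=2, padding=1), nn.ReLU(),
                                  nn.Conv3d(8, 16, 3, stride=2, padding=1), nn.ReLU())
        # 5. encoder / decoder: implicit (latent) coding of the model
        flat = 16 * (voxel // 4) ** 3
        self.encoder = nn.Linear(flat, hidden)
        self.decoder = nn.Linear(hidden, flat)
        # 6. transposed 3D convolutions: upsample and rebuild contour features
        self.up = nn.Sequential(nn.ConvTranspose3d(16, 8, 4, stride=2, padding=1), nn.ReLU(),
                                nn.ConvTranspose3d(8, channels, 4, stride=2, padding=1), nn.ReLU())
        # 7. fully connected layers: translation precoding of the simplified model
        self.fc_out = nn.Sequential(nn.Linear(channels * voxel ** 3, hidden), nn.ReLU(),
                                    nn.Linear(hidden, hidden), nn.ReLU())
        # 8. cross attention between decoded features and the early FC features
        self.attn = nn.MultiheadAttention(embed_dim=hidden, num_heads=4, batch_first=True)
        # 9. recurrent output layer emitting the simplified faces (L' x 9)
        self.rnn_out = nn.GRU(input_size=hidden, hidden_size=9, batch_first=True)

    def forward(self, faces, out_len):
        # faces: (1, L, 9) variable-length face sequence; out_len: faces kept after simplification
        seq, _ = self.rnn_in(faces)
        early = self.fc_in(seq[:, -1])                            # (1, hidden)
        vox = self.to_voxel(early).view(1, self.channels, *(self.voxel,) * 3)
        code = self.encoder(self.down(vox).flatten(1))            # implicit encoding
        vox = self.decoder(code).view(1, 16, *(self.voxel // 4,) * 3)
        feat = self.fc_out(self.up(vox).flatten(1))               # (1, hidden)
        fused, _ = self.attn(feat.unsqueeze(1), early.unsqueeze(1), early.unsqueeze(1))
        out, _ = self.rnn_out(fused.repeat(1, out_len, 1))        # (1, L', 9)
        return out.reshape(1, out_len, 3, 3)                      # simplified L' x 3 x 3 faces
For instance, MeshSimplifyNet()(torch.randn(1, 500, 9), out_len=150) would map a 500-face input to an output of shape (1, 150, 3, 3), with the retention ratio supplied by the caller.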
After the three-dimensional mesh computing deep neural network model is constructed, it is trained. In view of the difficulty of getting such a deep neural network model to converge, the following training method is provided:
1. Because the input X and the expected output Y of the three-dimensional mesh computing deep neural network model are both the original three-dimensional model, the method requires no data labeling;
2. The data samples for the model may be drawn from open data sets, proprietary data sets, or automatically generated three-dimensional models, as long as the sample data are representative;
3. Before training, the data sample files are preprocessed: each file is converted into an L × 3 × 3 floating-point array, normalized, centered at the origin of the three-dimensional coordinate system, and scaled to a data space with zero mean and unit variance;
4. In the initial training stage of the model, 3D IoU (the three-dimensional Jaccard similarity coefficient) is used as the loss function to accelerate convergence;
5. After the model converges, training switches to the rasterized structural similarity function as the loss function;
6. During training with the rasterized structural similarity function as the loss function, the model is further optimized by randomizing the illumination and viewing angle of the rasterization environment.
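A minimal sketch of this two-stage schedule is given below. The helpers iou_3d_loss and ssim_rasterize_loss (assumed to randomize the shader internally, as described in the next section), the optimizer choice, and the convergence test are all assumptions rather than details disclosed here:
import torch

def train(model, dataset, epochs=100, switch_threshold=1e-3):
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    use_ssim_loss = False                  # stage 1: 3D IoU (Jaccard) loss
    prev_epoch_loss = float("inf")
    for epoch in range(epochs):
        epoch_loss = 0.0
        for faces in dataset:              # unlabeled samples of shape (1, L, 9): the target Y is the input X itself
            out_len = max(4, int(faces.shape[1] * 0.3))   # illustrative retention coefficient
            pred = model(faces, out_len)
            if use_ssim_loss:
                loss = ssim_rasterize_loss(faces, pred)   # stage 2: randomized rasterized SSIM
            else:
                loss = iou_3d_loss(faces, pred)           # stage 1: accelerate convergence
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item()
        # switch to the rasterized SSIM loss once the IoU stage has converged
        if not use_ssim_loss and abs(prev_epoch_loss - epoch_loss) < switch_threshold:
            use_ssim_loss = True
        prev_epoch_loss = epoch_loss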
The principle of the rasterized structural similarity function is to rasterize the true value y and the predicted value y', both expressed as multi-dimensional arrays of the three-dimensional model, and then compute their structural similarity, as follows:
1. Randomly generate the elements of the rasterization shader: the shader type, the camera type and position, the light position and intensity, and so on;
2. Rasterize the input three-dimensional model Y and the output three-dimensional model Y' under that shader;
3. Substitute the planar images I and I' produced by rasterized rendering into the SSIM (structural similarity) function to obtain the loss value;
4. The same training step may rasterize the three-dimensional model Y and the output three-dimensional model Y' two or more times and average the loss values to accelerate the training process.
The defining code is as follows:
def ssim_rasterize(y_true, y_pred):
    return ssim(rasterize(y_true), rasterize(y_pred))
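An expanded sketch of the randomized, multi-render variant described in steps 1-4 is shown below; random_shader(), rasterize(), and ssim() are assumed helpers (for example a differentiable rasterizer plus an SSIM implementation), and the parameter ranges are illustrative only:
import random

def ssim_rasterize(y_true, y_pred, n_renders=3):
    total = 0.0
    for _ in range(n_renders):
        shader = random_shader(                              # 1. randomize the shading environment
            shader_type=random.choice(["flat", "phong"]),
            camera_distance=random.uniform(1.5, 3.0),
            camera_angles=(random.uniform(0, 360), random.uniform(-60, 60)),
            light_position=[random.uniform(-2.0, 2.0) for _ in range(3)],
            light_intensity=random.uniform(0.5, 1.5),
        )
        image_true = rasterize(y_true, shader)               # 2. render the ground-truth model Y
        image_pred = rasterize(y_pred, shader)               # 2. render the simplified output Y'
        total += 1.0 - ssim(image_true, image_pred)          # 3. turn similarity into a loss term
    return total / n_renders                                 # 4. average over the renders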
Finally, it should be noted that the foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described therein or substitute equivalents for some of their technical features. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (5)

1. A training method for a deep learning model for three-dimensional model mesh simplification, characterized in that it comprises an initial training stage in which the three-dimensional Jaccard similarity coefficient is used as the loss function to accelerate convergence of the model; after the model converges, training switches to the rasterized structural similarity function as the loss function; during training with the rasterized structural similarity function, the model is further optimized by randomizing the illumination and viewing angle of the rasterization environment;
the deep learning model for three-dimensional model mesh simplification comprises a recurrent neural network layer, fully connected layers, 3D convolution layers, an encoder layer and a decoder layer, and an attention layer;
the recurrent neural network layer serves as the input layer and receives variable-length three-dimensional model mesh-face data;
two or more fully connected layers follow the recurrent neural network layer and extract the features and connectivity of the mesh-face data passed from the previous layer;
the data are reshaped after the fully connected layers, applying a three-dimensional affine transformation to the data output by the preceding fully connected layer;
3D convolution layers follow in sequence, downsampling the input and extracting its external features;
the 3D convolution layers are followed by an encoder layer and a decoder layer for implicitly encoding and decoding the input three-dimensional model;
the encoder layer and the decoder layer are followed by two or more 3D convolution layers, which successively upsample the input and construct and refine contour features;
the 3D convolution layers are followed by two or more fully connected layers that perform translation precoding of the simplified model;
the attention layer is placed after the fully connected layers; it cross-attends between the fully connected layers before and after encoding and decoding, and controls the vertex output of the three-dimensional model according to trained weights so as to simplify the model;
the attention layer is followed by a recurrent neural network layer, which serves as the output layer and outputs the simplified model as an L × 3 × 3 floating-point array.
2. The training method of the deep learning model for three-dimensional model mesh simplification according to claim 1, wherein the code of the rasterized structural similarity function is:
def ssim_rasterize(y_true, y_pred):
    return ssim(rasterize(y_true), rasterize(y_pred))
3. The training method of the deep learning model for three-dimensional model mesh simplification according to claim 2, wherein the rasterized structural similarity function rasterizes the true value y and the predicted value y' of the multi-dimensional array representation of the three-dimensional model separately and then computes the structural similarity of the two.
4. The training method of the deep learning model for three-dimensional model mesh simplification according to claim 3, wherein the rasterized structural similarity function is computed by: randomly generating each element of the rasterization shader, namely the shader type, the camera type and position, and the light position and intensity; rasterizing the input three-dimensional model Y and the output three-dimensional model Y' under that shader; and substituting the planar images I and I' produced by rasterized rendering into the structural similarity function to obtain the loss value; the same training step may rasterize the three-dimensional model Y and the output three-dimensional model Y' two or more times and average the loss values to accelerate the training process.
5. The training method of the deep learning model for three-dimensional model mesh simplification according to claim 1, wherein, before the initial training stage begins, the data sample files are preprocessed: each file is converted into an L × 3 × 3 floating-point array, and each individual model is normalized by placing its center at the origin of the three-dimensional coordinate system and scaling it to a data space with zero mean and unit variance.
CN202211170843.7A 2022-09-23 2022-09-23 Deep learning model with simplified three-dimensional model grid and training method thereof Active CN115511050B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211170843.7A CN115511050B (en) 2022-09-23 2022-09-23 Deep learning model with simplified three-dimensional model grid and training method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211170843.7A CN115511050B (en) 2022-09-23 2022-09-23 Deep learning model with simplified three-dimensional model grid and training method thereof

Publications (2)

Publication Number Publication Date
CN115511050A (en) 2022-12-23
CN115511050B (en) 2023-07-21

Family

ID=84506110

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211170843.7A Active CN115511050B (en) 2022-09-23 2022-09-23 Deep learning model with simplified three-dimensional model grid and training method thereof

Country Status (1)

Country Link
CN (1) CN115511050B (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10643384B2 (en) * 2018-02-08 2020-05-05 Google Llc Machine learning-based geometric mesh simplification
US11043028B2 (en) * 2018-11-02 2021-06-22 Nvidia Corporation Reducing level of detail of a polygon mesh to decrease a complexity of rendered geometry within a scene
US10916054B2 (en) * 2018-11-08 2021-02-09 Adobe Inc. Three-dimensional mesh deformation using deep learning neural networks
CN112634128B (en) * 2020-12-22 2022-06-14 天津大学 Stereo image redirection method based on deep learning

Also Published As

Publication number Publication date
CN115511050A (en) 2022-12-23


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240423

Address after: 230000, Guanlang Building, No. 111 Tangmo Road, Baohe District, Hefei City, Anhui Province

Patentee after: Hefei comprehensive pipe gallery Investment Operation Co.,Ltd.

Country or region after: China

Patentee after: Hefei Ruisheng Smart Technology Co.,Ltd.

Address before: 230092 Jiantou Building, No. 229, Wuhan Road, Baohe District, Hefei, Anhui

Patentee before: Hefei comprehensive pipe gallery Investment Operation Co.,Ltd.

Country or region before: China