CN109658489B - Three-dimensional grid data processing method and system based on neural network - Google Patents


Info

Publication number: CN109658489B
Application number: CN201811540285.2A
Authority: CN (China)
Prior art keywords: model, feature, fusion, data, layer perception
Legal status: Active (the legal status is an assumption and is not a legal conclusion)
Other languages: Chinese (zh)
Other versions: CN109658489A (en)
Inventors: 高跃 (Yue Gao), 冯玉彤 (Yutong Feng), 赵曦滨 (Xibin Zhao)
Current assignee: Tsinghua University
Original assignee: Tsinghua University

Events:
Application filed by Tsinghua University
Priority to CN201811540285.2A
Publication of CN109658489A
Application granted
Publication of CN109658489B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Abstract

The application discloses a three-dimensional grid data processing method and system based on a neural network. The system comprises: a data acquisition unit for acquiring data to be processed from a database, where the data type of the data to be processed is the three-dimensional grid type and the data to be processed includes center point data, vertex vector data, unit normal vector data and adjacent surface data; a feature calculation unit for calculating fusion features corresponding to the data to be processed according to a neural network model, where the neural network model comprises a space description sub-model and a structure description sub-model; and a global feature calculation unit for calculating a global feature value corresponding to the fusion features according to a global multi-layer perception model and a maximum pooling model in the neural network model, the global feature value being used to describe the data to be processed. Through the technical scheme in the application, the feature extraction capability on the three-dimensional grid data of a three-dimensional model is improved, and the accuracy of processing three-dimensional grid data is improved.

Description

Three-dimensional grid data processing method and system based on neural network
Technical Field
The application relates to the technical field of stereoscopic vision object recognition, in particular to a three-dimensional grid data processing system based on a neural network and a three-dimensional grid data processing method based on a neural network.
Background
Three-dimensional object representation is one of the most important and fundamental problems in the fields of computer vision and computer graphics, and in recent years many three-dimensional object processing methods have been proposed. Three-dimensional grid data is a data type formed by sets of points, edges and faces in space, where the points are connected by edges and the faces are formed by closed sets of edges. This data type is widely applied to rendering and storing three-dimensional models in the field of computer graphics and, compared with other data types, gives a more approximate and intuitive expression of the three-dimensional object.
In the prior art, three-dimensional objects represented with three-dimensional grid data are generally processed by traditional graphics methods, such as the method based on spherical harmonic descriptors (Spherical Harmonic descriptor, SPH); the description precision of such traditional methods is low, making it difficult to meet the requirements of three-dimensional object research. In particular, processing three-dimensional grid data with traditional graphics methods presents the following difficulties:
1) Complexity: the three-dimensional grid data consists of a plurality of elements, and different connection relations can be defined among different types of elements;
2) Irregularity: the number of elements in three-dimensional grid data differs between models, and the elements are unordered.
Disclosure of Invention
The purpose of the present application is: the feature extraction capability of the three-dimensional grid data of the three-dimensional model is improved, and the accuracy of processing the three-dimensional grid data is improved.
According to a first aspect of the present application, there is provided a three-dimensional grid data processing system based on a neural network, the processing system including: the device comprises a data acquisition unit, a feature calculation unit and a global feature calculation unit; the data acquisition unit is used for acquiring data to be processed in the database, wherein the data type of the data to be processed is three-dimensional grid type, and the data to be processed comprises center point data, vertex vector data, unit normal vector data and adjacent surface data; the feature calculation unit is used for calculating fusion features corresponding to the data to be processed according to a neural network model, wherein the neural network model comprises a space description sub-model and a structure description sub-model; the global feature calculation unit is used for calculating a global feature value corresponding to the fusion feature according to a global multi-layer perception model and a maximum pooling model in the neural network model, and the global feature value is used for describing data to be processed.
In any of the above technical solutions, further, the feature calculating unit specifically further includes: a spatial feature calculation unit and a structural feature calculation unit; the space feature calculation unit is used for calculating initial space features according to the space description sub-model and the center point data; the structural feature calculation unit is used for calculating initial structural features according to a structural description sub-model, a vertex vector, a unit normal vector and adjacent surface data, wherein the structural description sub-model comprises a surface rotation convolution model and a surface kernel correlation model; the feature calculation unit is further configured to: and calculating fusion characteristics by adopting a grid aggregation model and a fusion multi-layer perception model according to the initial spatial characteristics, the initial structural characteristics and the adjacent surface data, wherein the fusion multi-layer perception model comprises a first fusion multi-layer perception model, a second fusion multi-layer perception model and a third fusion multi-layer perception model, and the fusion characteristics comprise a first fusion characteristic, a second fusion characteristic and a third fusion characteristic.
In any one of the above aspects, further, the structural feature calculating unit is configured to: according to the surface rotation convolution model, calculating a surface rotation convolution value of vertex vector data in the data to be processed, wherein a calculation formula corresponding to the surface rotation convolution value is as follows:
F̂ = g(f(v1, v2), f(v2, v3), f(v3, v1))

where F̂ is the face rotation convolution value, v1, v2, v3 are the vertex vector data, f(·) is a convolution kernel function, and g(·) is a multi-layer perception function;
calculating a surface kernel characteristic value according to the surface kernel correlation model, the unit normal vector data and the adjacent surface data; and calculating initial structural features according to the initial multi-layer perception model, the face rotation convolution value, the face kernel feature value and the unit normal vector data.
In any one of the above aspects, further, the feature calculation unit is configured to: calculating intermediate structural features according to the initial structural features, the adjacent surface data and the grid aggregation model, wherein the intermediate structural features comprise a first structural feature and a second structural feature; calculating a first fusion feature by using the first fusion multi-layer perception model according to the initial spatial feature and the initial structural feature; calculating a second fusion feature by using the second fusion multi-layer perception model according to the first fusion feature and the first structural feature; and calculating a third fusion feature by using the third fusion multi-layer perception model according to the second fusion feature and the second structural feature.
In any of the foregoing technical solutions, further, the global feature calculating unit specifically includes: a fusion unit and a pooling unit; the fusion unit is used for calculating a feature fusion value according to the global multi-layer perception model and the fusion features; and the pooling unit is used for carrying out pooling operation on the feature fusion value according to the maximum pooling model, and recording the pooling operation result as a global feature value.
According to a second aspect of the present application, a method for processing three-dimensional grid data based on a neural network is provided, where the method includes: step 1, obtaining data to be processed in a database, wherein the data type of the data to be processed is three-dimensional grid type, and the data to be processed comprises center point data, vertex vector data, unit normal vector data and adjacent surface data; step 2, calculating fusion characteristics corresponding to the data to be processed according to a neural network model, wherein the neural network model comprises a space description sub-model and a structure description sub-model; and 3, calculating a global characteristic value corresponding to the fusion characteristic according to a global multi-layer perception model and a maximum pooling model in the neural network model, wherein the global characteristic value is used for describing data to be processed.
In any of the above technical solutions, further, step 2 specifically includes: step 21, calculating initial spatial features according to the spatial description sub-model and the center point data; step 22, calculating initial structural features according to a structural description sub-model, a vertex vector, a unit normal vector and adjacent surface data, wherein the structural description sub-model comprises a surface rotation convolution model and a surface kernel correlation model; and step 23, calculating fusion characteristics by adopting a grid aggregation model and a fusion multi-layer perception model according to the initial spatial characteristics, the initial structural characteristics and the adjacent surface data, wherein the fusion multi-layer perception model comprises a first fusion multi-layer perception model, a second fusion multi-layer perception model and a third fusion multi-layer perception model, and the fusion characteristics comprise a first fusion characteristic, a second fusion characteristic and a third fusion characteristic.
In any of the above solutions, further, step 22 specifically includes: according to the surface rotation convolution model, calculating a surface rotation convolution value of vertex vector data in the data to be processed, wherein a calculation formula corresponding to the surface rotation convolution value is as follows:
F̂ = g(f(v1, v2), f(v2, v3), f(v3, v1))

where F̂ is the face rotation convolution value, v1, v2, v3 are the vertex vector data, f(·) is a convolution kernel function, and g(·) is a multi-layer perception function;
calculating a surface kernel characteristic value according to the surface kernel correlation model, the unit normal vector data and the adjacent surface data; and calculating initial structural features according to the initial multi-layer perception model, the face rotation convolution value, the face kernel feature value and the unit normal vector data.
In any of the above solutions, further, step 23 specifically includes: calculating intermediate structural features according to the initial structural features, the adjacent surface data and the grid aggregation model, wherein the intermediate structural features comprise a first structural feature and a second structural feature; calculating a first fusion feature by using the first fusion multi-layer perception model according to the initial spatial feature and the initial structural feature; calculating a second fusion feature by using the second fusion multi-layer perception model according to the first fusion feature and the first structural feature; and calculating a third fusion feature by using the third fusion multi-layer perception model according to the second fusion feature and the second structural feature.
In any of the above technical solutions, further, step 3 specifically includes: step 31, calculating a feature fusion value according to the global multi-layer perception model and the fusion features; and step 32, carrying out pooling operation on the feature fusion value according to the maximum pooling model, and marking the pooling operation result as a global feature value.
The beneficial effects of this application are: by setting the space description sub-model and the structure description sub-model, the fusion features of the data to be processed are calculated, which reduces the processing difficulty caused by the complexity and irregularity of three-dimensional grid type data, improves the accuracy of feature extraction and three-dimensional model description using three-dimensional grid type data, and effectively mines the feature information of the three-dimensional model. The corresponding global feature value is then calculated according to the multi-layer perception model and the maximum pooling model, and the obtained global feature can be applied to model classification and retrieval, thereby improving the accuracy and efficiency of three-dimensional model database retrieval using three-dimensional grid type data.
Drawings
The advantages of the foregoing and/or additional aspects of the present application will become apparent and readily appreciated from the description of the embodiments, taken in conjunction with the accompanying drawings, wherein:
FIG. 1 is a schematic block diagram of a neural-network-based three-dimensional grid data processing system according to one embodiment of the present application;
FIG. 2 is a schematic diagram of three-dimensional grid type data according to one embodiment of the application;
FIG. 3 is a schematic illustration of the face rotation convolution calculation according to one embodiment of the present application;
FIG. 4 is a schematic flow chart of a neural-network-based three-dimensional grid data processing method according to one embodiment of the present application.
Detailed Description
In order that the above-recited objects, features and advantages of the present application will be more clearly understood, a more particular description of the application will be rendered by reference to the appended drawings and appended detailed description. It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application, however, the present application may be practiced otherwise than as described herein, and thus the scope of the present application is not limited to the specific embodiments disclosed below.
Embodiment one:
an embodiment one of the present application is described below with reference to fig. 1 to 3.
As shown in fig. 1, the present embodiment provides a three-dimensional mesh data processing system 100 based on a neural network, including: a data acquisition unit 10, a feature calculation unit 20, and a global feature calculation unit 30; the data acquisition unit 10 is configured to acquire data to be processed in a database, where a data type of the data to be processed is a three-dimensional grid type, and the data to be processed includes center point data, vertex vector data, unit normal vector data and adjacent surface data;
specifically, as shown in fig. 2, the three-dimensional grid data is divided into a model W by using a plane as a unit, the model W is divided into a plurality of planes, for example, a triangle network analysis method is adopted to divide the model W into a plurality of groups of data to be processed, the data to be processed by using the plane as a unit is stored in a database, and the data to be processed corresponding to any one of the divided planes g includes center point data, vertex vector data, unit normal vector data and adjacent plane data.
The center point data is the center coordinate O of the surface g, and is obtained by taking an arithmetic average value from the vertex coordinates A, B and C of the surface g;
the vertex vector data is the vector of any vertex coordinate in the center coordinate O pointing surface g
Figure GDA0004167043200000061
And->
Figure GDA0004167043200000062
I.e. v 1 ,v 2 ,v 3 The arrangement sequence of the vectors is ensured to be consistent with the direction of the unit normal vector according to the right-hand spiral rule;
the unit normal vector data is calculated by a cross multiplication operation and a principal component analysis method (Principal Component Analysis, PCA), namely, two unit vectors which are opposite in direction and perpendicular to the surface are obtained by the cross multiplication operation, and principal component analysis is carried out on all surfaces of the integral model by the principal component analysis method, so that the orientation of most surfaces is consistent, and the unit normal vector data is determined;
the adjacent surface data is adjacent surface indexes which are shared with the surfaces, namely, a plurality of surfaces forming the model W are numbered, wherein the surfaces are the surface indexes, and the indexes corresponding to the adjacent surfaces which are shared with the surfaces are e, d and f for the selected surface g, so that the adjacent surface data corresponding to the surface g are e, d and f, and if the number of the adjacent surfaces of the surface g is less than 3, the indexes of the surface g are used as the adjacent surface data.
The feature calculation unit 20 is configured to calculate spatial features and structural features corresponding to data to be processed according to a neural network model, where the neural network model includes a spatial description sub-model and a structural description sub-model;
further, the feature calculation unit 20 specifically further includes: a spatial feature calculation unit 21 and a structural feature calculation unit 22; the spatial feature calculation unit 21 is used for calculating initial spatial features according to the spatial description sub-model and the center point data;
specifically, any one of the center point data of the plurality of sets of data to be processed (three-dimensional mesh data) of the model W may be represented as α (x 1, y1, z 1) in the three-dimensional coordinate system, the center point data α is input into the spatial feature calculation unit 21, and the initial spatial feature is calculated from the spatial description sub-model by the spatial feature calculation unit 21.
The spatial feature calculation unit 21 includes a spatial multi-layer perception model composed of 3 fully connected layers with dimensions 3 and 64; the center point data α of each group of data to be processed are input into the spatial multi-layer perception model, and its output is the initial spatial feature.
The structural feature calculation unit 22 is configured to calculate an initial structural feature from a structural description sub-model, a vertex vector, a unit normal vector, and adjacent surface data, where the structural description sub-model includes a surface rotation convolution model and a surface kernel correlation model;
preferably, the structural feature calculation unit 22 is configured to: according to the surface rotation convolution model, calculating a surface rotation convolution value of vertex vector data in the data to be processed, wherein a calculation formula corresponding to the surface rotation convolution value is as follows:
F̂ = g(f(v1, v2), f(v2, v3), f(v3, v1))

where F̂ is the face rotation convolution value, v1, v2, v3 are the vertex vector data, f(·) is a convolution kernel function, and g(·) is a multi-layer perception function.

Specifically, as shown in fig. 3, the vertex vector data in any one group of three-dimensional grid data may be represented as {v1, v2, v3}. Two adjacent vertex vectors are selected in turn following the normal vector direction, and the rotational convolution value corresponding to each pair of vertex vectors is calculated; in this embodiment, the rotational convolution values calculated with the convolution kernel f(·) following the arrow direction in fig. 2 are, in order, f(v1, v2), f(v2, v3) and f(v3, v1). These rotational convolution values are taken as the input of the multi-layer perception function g(·) to calculate the face rotation convolution value F̂ corresponding to the group of three-dimensional grid data.
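A minimal numeric sketch of the face rotation convolution described above. The pairing order (v1, v2), (v2, v3), (v3, v1) follows the text; the mean pooling over the three pair values before applying g(·), the ReLU nonlinearity, and the layer sizes (32 and 64) are assumptions of this sketch, not taken from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)
W_f = rng.normal(size=(6, 32))   # illustrative kernel f: R^6 -> R^32
W_g = rng.normal(size=(32, 64))  # illustrative perception layer g

def face_rotate_conv(v1, v2, v3):
    """Apply kernel f to each adjacent pair of vertex vectors in rotation
    order, pool the three pair values (mean, an assumption), then apply
    the perception function g."""
    f = lambda a, b: np.maximum(np.concatenate([a, b]) @ W_f, 0.0)
    pooled = (f(v1, v2) + f(v2, v3) + f(v3, v1)) / 3.0
    return np.maximum(pooled @ W_g, 0.0)
```

Because a cyclic rotation of (v1, v2, v3) produces the same set of three pairs, the pooled output does not depend on which vertex is taken first, which is the point of the rotation construction.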
Calculating a surface kernel characteristic value according to the surface kernel correlation model, the unit normal vector data and the adjacent surface data;
specifically, the method for mining structural features by using reference point cloud data uses unit normal vector data and adjacent surface data as input quantities of a surface kernel correlation model, wherein a plurality of operation kernels are arranged in the surface kernel correlation model, each operation kernel consists of a group of parameters representing 4 unit normal vector distributions, and for the unit normal vector data of any group of three-dimensional grid data, a spherical coordinate system is adopted
Figure GDA0004167043200000074
Representing, wherein r is the radius of a unit normal vector, the value of r is 1, θ is the included angle between the central point O and the z axis in the three-dimensional space coordinate system, and +.>
Figure GDA0004167043200000075
Is the included angle between the central point O and the x axis after being projected to the xy plane by the connecting line of the origin of the three-dimensional space coordinate system. Euclidean space coordinates (x2, y2, z 2) is calculated as:
Figure GDA0004167043200000076
thus, any unit normal vector can be expressed as
Figure GDA0004167043200000077
For the surface g, the adjacent surfaces are the surface d, the surface e and the surface f. Each operation kernel in the surface kernel correlation model is used to carry out a kernel correlation operation; in this operation, each group of unit normal vectors is summed against the corresponding adjacent surface data and the result is normalized, obtaining the surface kernel characteristic value KC(i, j) of the surface g, with the corresponding calculation formula:

KC(i, j) = (1 / (|N_i| · |M_j|)) · Σ_{n ∈ N_i} Σ_{m ∈ M_j} exp(−‖n − m‖² / (2σ²))

where N_i is the set composed of the unit normal vectors n of the selected i-th surface and its adjacent surfaces, M_j is the set of learnable unit vectors m of the j-th operation kernel, and σ is a hyperparameter for controlling the resolving accuracy of the kernel.
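The spherical-coordinate conversion and the kernel correlation above can be sketched as follows. The value σ = 0.2 and the set sizes in the test are illustrative choices, and the kernel vectors are shown fixed rather than learned:

```python
import numpy as np

def sph_to_cart(theta, phi, r=1.0):
    """Spherical-to-Euclidean conversion used to express unit normals."""
    return np.array([r * np.sin(theta) * np.cos(phi),
                     r * np.sin(theta) * np.sin(phi),
                     r * np.cos(theta)])

def face_kernel_correlation(normals, kernel, sigma=0.2):
    """KC(i, j) for one face: the normalized sum of Gaussian affinities
    between N_i (unit normals of the face and its neighbours) and M_j
    (the j-th operation kernel's learnable unit vectors). A sketch of
    the formula above; sigma is an illustrative hyperparameter value."""
    total = sum(np.exp(-np.sum((n - m) ** 2) / (2 * sigma ** 2))
                for n in normals for m in kernel)
    return total / (len(normals) * len(kernel))
```

When the kernel's vectors coincide with the face's normals the correlation reaches 1, and it decays toward 0 as the two distributions separate, which is what lets each kernel act as a learnable structural template.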
And calculating initial structural features according to the initial multi-layer perception model, the face rotation convolution value, the face kernel feature value and the unit normal vector data.
Specifically, the face rotation convolution value F̂, the surface kernel characteristic value KC(i, j) and the unit normal vector data are taken as input: the three types of data are directly connected and input into the initial multi-layer perception model, which comprises 3 fully connected layers of dimension 131, and the initial multi-layer perception model calculates the corresponding initial structural features, whose dimension is 131.
The feature calculation unit 20 is also configured to: according to the initial spatial features, the initial structural features and the adjacent surface data, a grid aggregation model and a fusion multi-layer perception model are adopted to calculate fusion features, the fusion features comprise a first fusion feature, a second fusion feature and a third fusion feature, and the fusion multi-layer perception model comprises a first fusion multi-layer perception model, a second fusion multi-layer perception model and a third fusion multi-layer perception model.
Preferably, the feature calculation unit 20 is configured to: calculating intermediate structural features according to the initial structural features, the adjacent surface data and the grid aggregation model, wherein the intermediate structural features comprise a first structural feature and a second structural feature; calculating a first fusion feature by using the first fusion multi-layer perception model according to the initial spatial feature and the initial structural feature; calculating a second fusion feature by using the second fusion multi-layer perception model according to the first fusion feature and the first structural feature; and calculating a third fusion feature by using the third fusion multi-layer perception model according to the second fusion feature and the second structural feature.
Specifically, the grid aggregation model comprises a first grid aggregation model and a second grid aggregation model. The first grid aggregation model aggregates the initial structural features with the adjacent surface data, and the aggregation result is recorded as the first structural feature; the second grid aggregation model then aggregates the first structural features with the adjacent surface data, and the aggregation result is recorded as the second structural feature. The purpose of the mesh aggregation model is to enlarge the perception area represented by the features of the selected surface (surface g) and to mine more complex structural information: after aggregation, the features of surface g are calculated from the features of four surfaces in total, namely surface g, surface d, surface e and surface f, so the perception area represented by the features of surface g is enlarged and the structures those features can express are richer. The mesh aggregation model is one of an average pooling model, a maximum pooling model and a connection fusion model.
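A sketch of the grid aggregation step, assuming per-surface feature rows and a (surface, 3-neighbour) adjacency table; the three modes mirror the average pooling, maximum pooling and connection fusion variants named above, and everything else is illustrative:

```python
import numpy as np

def mesh_aggregate(features, adjacency, mode="max"):
    """Aggregate each face's structural feature with those of its three
    edge-sharing neighbours, enlarging the perception area as described.
    `features` is (F, d); `adjacency` is (F, 3) of neighbour indices."""
    out = []
    for i in range(features.shape[0]):
        group = features[[i, *adjacency[i]]]  # face + its 3 neighbours
        if mode == "max":
            out.append(group.max(axis=0))     # maximum pooling model
        elif mode == "avg":
            out.append(group.mean(axis=0))    # average pooling model
        else:
            out.append(group.reshape(-1))     # connection fusion model
    return np.stack(out)
```

Stacking this operation twice, as the first and second grid aggregation models do, grows the perception area from 4 surfaces to the neighbours-of-neighbours.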
The initial spatial feature and the initial structural feature are directly connected and input into the first fusion multi-layer perception model for fusion; this model comprises two fully connected layers with dimensions 128 and 256 in sequence, and the fusion result of the initial spatial feature and the initial structural feature is recorded as the first fusion feature.
The first fusion feature and the first structural feature among the intermediate structural features are then directly connected and input into the second fusion multi-layer perception model for fusion; this model comprises two fully connected layers with dimensions 512 and 512 in sequence, and the fusion result of the first fusion feature and the first structural feature is recorded as the second fusion feature.
Finally, the second fusion feature and the second structural feature among the intermediate structural features are directly connected and input into the third fusion multi-layer perception model for fusion; this model comprises two fully connected layers with dimensions 1024 and 1024 in sequence, and the fusion result of the second fusion feature and the second structural feature is recorded as the third fusion feature.
In this embodiment, the global feature calculation unit 30 is configured to calculate the global feature value corresponding to the fusion features according to a global multi-layer perception model and a maximum pooling model in the neural network model, where the global feature value is used to describe the data to be processed.
Further, the global feature calculation unit 30 specifically includes: a fusion unit and a pooling unit; the fusion unit is used for calculating a feature fusion value according to the global multi-layer perception model and the fusion features;
specifically, the first fusion feature, the second fusion feature and the third fusion feature are directly connected and then input into a global multi-layer perception model, the global multi-layer perception model comprises two fully connected layers, wherein the dimensions of the fully connected layers are 1792 and 1024 in sequence, the global multi-layer perception model performs feature fusion, and a fusion result is recorded as a feature fusion value.
And the pooling unit is used for carrying out pooling operation on the feature fusion value according to the maximum pooling model, and recording the pooling operation result as a global feature value.
Specifically, the global feature value is obtained by performing a maximum pooling operation on the total fusion feature, which changes the original feature of size μ×f into size 1×f, wherein f is the dimension of the fusion feature value and μ is the number of faces into which the model W is divided; the maximum pooling operation eliminates the unordered arrangement of the faces in the model W. The global feature value can be directly input into a classification neural network to perform classification tasks, or used for feature distance calculation in retrieval tasks, so that it can be applied to more complex three-dimensional object tasks.
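As an illustration only (not the patented implementation), the maximum pooling step can be sketched in NumPy; the face count μ and feature dimension f below are toy values. The assertion on the shuffled input shows why max pooling removes the unordered arrangement of faces:

```python
import numpy as np

def global_max_pool(fused_features):
    """Collapse the per-face dimension (mu x f -> 1 x f) by max pooling.

    Taking the maximum over the face axis makes the result independent
    of the order in which the mu faces of model W are listed.
    """
    return fused_features.max(axis=0, keepdims=True)

# Toy example with mu = 4 faces and f = 3 feature dimensions.
faces = np.array([[0.1, 0.9, 0.2],
                  [0.7, 0.3, 0.4],
                  [0.2, 0.8, 0.6],
                  [0.5, 0.1, 0.3]])

pooled = global_max_pool(faces)                  # shape (1, 3)
shuffled = global_max_pool(faces[[2, 0, 3, 1]])  # same faces, new order

assert pooled.shape == (1, 3)
assert np.array_equal(pooled, shuffled)  # face order does not matter
```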
Embodiment two:
as shown in fig. 4, the present embodiment provides a three-dimensional grid data processing method based on a neural network, including: step 1, obtaining data to be processed in a database, wherein the data type of the data to be processed is three-dimensional grid type, and the data to be processed comprises center point data, vertex vector data, unit normal vector data and adjacent surface data;
step 2, calculating fusion characteristics corresponding to the data to be processed according to a neural network model, wherein the neural network model comprises a space description sub-model and a structure description sub-model;
further, in the step 2, specifically includes:
step 21, calculating initial spatial features according to the spatial description sub-model and the center point data;
specifically, for any one of the plurality of sets of data to be processed (three-dimensional mesh data) of the model W, the center point data may be expressed as α(x1, y1, z1) in a three-dimensional coordinate system; the center point data α is input into the space description sub-model, and the initial spatial features are calculated by the space description sub-model.
The space description sub-model comprises a space multi-layer perception model formed by 3 full-connection layers, wherein the dimensions of the full-connection layers are 3, 64 and 64 respectively, center point data alpha in a plurality of groups of to-be-processed data are respectively input into the space multi-layer perception model, and an output result of the space multi-layer perception model is an initial space characteristic.
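A minimal NumPy sketch of the spatial multi-layer perception model may help fix the dimensions. It reads the stated sizes (3, 64, 64) as an input width of 3 followed by two fully connected layers of width 64; the ReLU activation and random stand-in weights are assumptions for illustration, not taken from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, w, b):
    """One fully connected layer; ReLU activation is an assumption."""
    return np.maximum(x @ w + b, 0.0)

# Widths 3 -> 64 -> 64, matching the spatial multi-layer perception
# model described above; weights here are random stand-ins.
w1, b1 = rng.normal(size=(3, 64)), np.zeros(64)
w2, b2 = rng.normal(size=(64, 64)), np.zeros(64)

centers = rng.normal(size=(10, 3))  # center points alpha for 10 faces

initial_spatial = dense(dense(centers, w1, b1), w2, b2)
assert initial_spatial.shape == (10, 64)
```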
Step 22, calculating initial structural features according to a structural description sub-model, a vertex vector, a unit normal vector and adjacent surface data, wherein the structural description sub-model comprises a surface rotation convolution model and a surface kernel correlation model;
preferably, the step 22 specifically includes:
according to the surface rotation convolution model, calculating a surface rotation convolution value of vertex vector data in the data to be processed, wherein the calculation formula corresponding to the surface rotation convolution value is as follows:

$$F_r = g\big(f(v_1, v_2),\ f(v_2, v_3),\ f(v_3, v_1)\big)$$

where $F_r$ is the face rotation convolution value, $v_1, v_2, v_3$ are the vertex vector data, $f(\cdot)$ is a convolution kernel function, and $g(\cdot)$ is a multi-layer perception model;
calculating a surface kernel characteristic value according to the surface kernel correlation model, the unit normal vector data and the adjacent surface data;
and calculating initial structural features according to the initial multi-layer perception model, the face rotation convolution value, the face kernel feature value and the unit normal vector data.
Specifically, the vertex vector data in any one set of stereoscopic mesh data may be represented as {v1, v2, v3}. Two adjacent vertex vectors are selected in turn according to the normal vector direction, and the rotation convolution values f(v1, v2), f(v2, v3) and f(v3, v1) are calculated in sequence using the convolution kernel function f(·). These rotation convolution values are then taken as the input of the multi-layer perception function g(·), and the surface rotation convolution value F_r corresponding to this set of three-dimensional grid data is calculated.
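The face rotation convolution can be sketched as follows. The intermediate width 16, the output width 64, the ReLU activations, and the single-layer forms of f(·) and g(·) are assumptions made only for illustration; the patent specifies only that f(·) is applied to the three rotated vertex-vector pairs and g(·) to their results:

```python
import numpy as np

rng = np.random.default_rng(1)

def f(va, vb, wf):
    """Shared convolution kernel f(.) applied to a pair of vertex vectors."""
    return np.maximum(np.concatenate([va, vb]) @ wf, 0.0)

def face_rotate_conv(v1, v2, v3, wf, wg):
    """F_r = g(f(v1,v2), f(v2,v3), f(v3,v1)), with g as a one-layer MLP."""
    pairs = np.concatenate([f(v1, v2, wf), f(v2, v3, wf), f(v3, v1, wf)])
    return np.maximum(pairs @ wg, 0.0)

wf = rng.normal(size=(6, 16))   # f: two stacked 3-vectors -> 16 features
wg = rng.normal(size=(48, 64))  # g: three 16-vectors -> 64 features

v1, v2, v3 = rng.normal(size=(3, 3))  # vertex vectors of one face
out = face_rotate_conv(v1, v2, v3, wf, wg)
assert out.shape == (64,)
```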
Drawing on methods for mining structural features from point cloud data, the unit normal vector data and the adjacent surface data are taken as the input quantities of the surface kernel correlation model. A plurality of operation kernels are arranged in the surface kernel correlation model, and each operation kernel consists of a group of parameters representing 4 unit normal vector distributions. For the unit normal vector data of any group of three-dimensional grid data, a spherical coordinate representation (r, θ, φ) is adopted, wherein r is the radius of the unit normal vector and takes the value 1, θ is the included angle between the unit normal vector and the z axis of the three-dimensional space coordinate system, and φ is the included angle between the x axis and the projection of the unit normal vector onto the xy plane. The calculation formula of the Euclidean space coordinates (x2, y2, z2) corresponding to any unit normal vector is as follows:

$$x_2 = r\sin\theta\cos\varphi,\qquad y_2 = r\sin\theta\sin\varphi,\qquad z_2 = r\cos\theta$$

Thus, any unit normal vector can be expressed as n = (sinθ cosφ, sinθ sinφ, cosθ).
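The spherical-to-Euclidean conversion above is a standard change of coordinates and can be checked directly:

```python
import math

def spherical_to_euclidean(theta, phi, r=1.0):
    """Convert a unit normal vector (r = 1) from spherical to Euclidean
    coordinates: x = r sin(theta)cos(phi), y = r sin(theta)sin(phi),
    z = r cos(theta)."""
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return x, y, z

# theta = 90 degrees, phi = 0: the normal lies along the x axis.
x, y, z = spherical_to_euclidean(math.pi / 2, 0.0)
assert abs(x - 1.0) < 1e-9 and abs(y) < 1e-9 and abs(z) < 1e-9
# With r = 1 the result is always a unit vector.
assert abs(x * x + y * y + z * z - 1.0) < 1e-9
```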
For the surface g, whose adjacent surfaces are the surface d, the surface e and the surface f, each operation kernel in the surface kernel correlation model is utilized to carry out the kernel correlation operation. During the kernel correlation operation, the correlations between each group of unit normal vectors and the corresponding kernel vectors are summed and then normalized, so as to obtain the surface kernel characteristic value KC(i, j) of the surface g, and the corresponding calculation formula is as follows:

$$KC(i,j) = \frac{1}{|N_i|} \sum_{n \in N_i} \sum_{m \in M_j} K_\sigma(m, n)$$

$$K_\sigma(m, n) = \exp\!\left(-\frac{\lVert m - n \rVert^2}{2\sigma^2}\right)$$

wherein N_i represents the set composed of the unit normal vectors of the selected i-th face and its adjacent faces, M_j represents the set of kernel vectors m of the j-th operation kernel, wherein the vector m is a learnable parameter with the same dimension as the unit normal vector, and σ is a hyper-parameter for controlling the resolving accuracy of the kernel.
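A minimal sketch of the kernel correlation, assuming the Gaussian form and normalize-after-sum shown above; the σ value and the toy normals are illustrative. The test case shows the intended behaviour: a kernel whose vectors align with the face normals scores high, an opposed kernel scores near zero:

```python
import numpy as np

def face_kernel_correlation(normals, kernel, sigma=0.2):
    """KC(i, j) for one face: summed Gaussian correlation between the
    normals of face i (and its neighbours) and the vectors of kernel j,
    normalized by the number of normals."""
    total = 0.0
    for n in normals:      # N_i: 4 unit normals (face + 3 neighbours)
        for m in kernel:   # M_j: learnable kernel vectors
            total += np.exp(-np.sum((m - n) ** 2) / (2.0 * sigma ** 2))
    return total / len(normals)  # normalization after summation

normals = np.array([[0, 0, 1.0]] * 4)       # face g + neighbours d, e, f
aligned = np.array([[0, 0, 1.0]])           # kernel matching the normals
opposed = np.array([[0, 0, -1.0]])          # kernel pointing the other way

assert face_kernel_correlation(normals, aligned) > \
       face_kernel_correlation(normals, opposed)
assert abs(face_kernel_correlation(normals, aligned) - 1.0) < 1e-9
```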
Specifically, the surface rotation convolution value F_r, the surface kernel characteristic value KC(i, j) and the unit normal vector data are taken as input: the three kinds of data are directly connected and then input into an initial multi-layer perception model, wherein the initial multi-layer perception model comprises 3 fully connected layers whose dimensions are all 131, and the corresponding initial structural feature, whose dimension is 131, is calculated by the initial multi-layer perception model.
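The stated dimension 131 is consistent with directly connecting, for example, a 64-dimensional rotation convolution value, a 64-dimensional kernel characteristic value and the 3-dimensional unit normal vector. This 64 + 64 + 3 split is an assumption, shown here only to make the dimension arithmetic concrete:

```python
import numpy as np

# Assumed feature widths: 64-dim rotate convolution value, 64-dim kernel
# correlation value, 3-dim unit normal -> 131-dim direct connection,
# matching the 131-dimensional fully connected layers stated above.
rotate_conv = np.zeros(64)
kernel_corr = np.zeros(64)
unit_normal = np.array([0.0, 0.0, 1.0])

structural_input = np.concatenate([rotate_conv, kernel_corr, unit_normal])
assert structural_input.shape == (131,)
```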
And step 23, calculating fusion characteristics by adopting a grid aggregation model and a fusion multi-layer perception model according to the initial spatial characteristics, the initial structural characteristics and the adjacent surface data, wherein the fusion multi-layer perception model comprises a first fusion multi-layer perception model, a second fusion multi-layer perception model and a third fusion multi-layer perception model, and the fusion characteristics comprise a first fusion characteristic, a second fusion characteristic and a third fusion characteristic.
Preferably, the step 23 specifically includes:
calculating intermediate structural features according to the initial structural features, the adjacent surface data and the grid aggregation model, wherein the intermediate structural features comprise a first structural feature and a second structural feature; calculating a first fusion feature by using the first fusion multi-layer perception model according to the initial spatial feature and the initial structural feature; calculating a second fusion feature by using the second fusion multi-layer perception model according to the first fusion feature and the first structural feature; and calculating a third fusion feature by using the third fusion multi-layer perception model according to the second fusion feature and the second structural feature.
Specifically, the grid aggregation model comprises a first grid aggregation model and a second grid aggregation model. The first grid aggregation model is utilized to aggregate the initial structural features and the adjacent surface data, and the aggregation result is recorded as the first structural feature; the second grid aggregation model is then utilized to aggregate the first structural features and the adjacent surface data, and the aggregation result is recorded as the second structural feature. The mesh aggregation model aims to enlarge the sensing area represented by the features of the selected surface (surface g) and to mine more complex structural information within it: after the mesh aggregation model is applied, the features of the surface g are calculated from the features of four surfaces in total, namely the surface g, the surface d, the surface e and the surface f, so that the sensing area represented by the features of the surface g is enlarged and more complex structural patterns can be captured. The grid aggregation model is one of an average pooling model, a maximum pooling model and a connection fusion model.
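The three aggregation variants can be sketched for a single surface g with neighbours d, e and f; the feature width of 2 is a toy value:

```python
import numpy as np

def aggregate(face_feat, neighbor_feats, mode="max"):
    """Aggregate a face's feature with its 3 neighbours' features,
    enlarging the sensing area of the face: 'max' and 'avg' pool
    element-wise, 'concat' is the connection-fusion variant."""
    stacked = np.vstack([face_feat] + list(neighbor_feats))
    if mode == "max":
        return stacked.max(axis=0)
    if mode == "avg":
        return stacked.mean(axis=0)
    return stacked.reshape(-1)  # concat: 4x the feature dimension

g = np.array([1.0, 0.0])
d, e, f = np.array([0.0, 2.0]), np.array([0.5, 0.5]), np.array([0.2, 0.1])

assert np.array_equal(aggregate(g, [d, e, f], "max"), np.array([1.0, 2.0]))
assert aggregate(g, [d, e, f], "concat").shape == (8,)
```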
Directly connecting the initial spatial feature and the initial structural feature, and then inputting the initial spatial feature and the initial structural feature into a first fusion multi-layer perception model for fusion, wherein the first fusion multi-layer perception model comprises two full-connection layers, the dimensions of the full-connection layers are 128 and 256 in sequence, and the fusion result of the initial spatial feature value and the initial structural feature value is recorded as a first fusion feature;
the first fusion feature and the first structural feature in the intermediate structural features are directly connected and then input into a second fusion multi-layer perception model for fusion, wherein the second fusion multi-layer perception model comprises two fully connected layers whose dimensions are 512 and 512 in sequence, and the fusion result of the first fusion feature and the first structural feature is recorded as a second fusion feature;
finally, the second fusion feature and the second structural feature in the intermediate structural features are directly connected and then input into a third fusion multi-layer perception model for fusion, wherein the third fusion multi-layer perception model comprises two fully connected layers whose dimensions are 1024 and 1024 in sequence, and the fusion result of the second fusion feature and the second structural feature is recorded as a third fusion feature.
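The three-stage fusion chain can be sketched end to end. The layer widths 128/256, 512/512 and 1024/1024 follow the text, while the structural-feature width after aggregation (131, i.e. a pooling-style aggregation that preserves the dimension), the ReLU activations and the random stand-in weights are assumptions. Note that the three fusion-feature widths sum to 256 + 512 + 1024 = 1792, matching the first dimension of the global multi-layer perception model:

```python
import numpy as np

rng = np.random.default_rng(2)

def mlp(x, dims):
    """Stack of fully connected layers with ReLU; weights are random
    stand-ins used only to trace the feature dimensions."""
    for d in dims:
        w = rng.normal(size=(x.shape[-1], d))
        x = np.maximum(x @ w, 0.0)
    return x

spatial = rng.normal(size=(10, 64))       # initial spatial features
structural = rng.normal(size=(10, 131))   # initial structural features
struct1 = rng.normal(size=(10, 131))      # first structural features (assumed width)
struct2 = rng.normal(size=(10, 131))      # second structural features (assumed width)

fuse1 = mlp(np.concatenate([spatial, structural], axis=-1), [128, 256])
fuse2 = mlp(np.concatenate([fuse1, struct1], axis=-1), [512, 512])
fuse3 = mlp(np.concatenate([fuse2, struct2], axis=-1), [1024, 1024])

assert fuse1.shape == (10, 256)
assert fuse2.shape == (10, 512)
assert fuse3.shape == (10, 1024)
```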
And 3, calculating a global characteristic value corresponding to the fusion characteristic according to a global multi-layer perception model and a maximum pooling model in the neural network model, wherein the global characteristic value is used for describing data to be processed.
Further, the step 3 specifically includes:
step 31, calculating a feature fusion value according to the global multi-layer perception model and the fusion features;
specifically, the first fusion feature, the second fusion feature and the third fusion feature are directly connected and then input into a global multi-layer perception model, the global multi-layer perception model comprises two fully connected layers, wherein the dimensions of the fully connected layers are 1792 and 1024 in sequence, the global multi-layer perception model performs feature fusion, and a fusion result is recorded as a feature fusion value.
And step 32, carrying out pooling operation on the feature fusion value according to the maximum pooling model, and marking the pooling operation result as a global feature value.
Specifically, the global feature value is obtained by performing a maximum pooling operation on the total fusion feature, which changes the original feature of size μ×f into size 1×f, wherein f is the dimension of the fusion feature value and μ is the number of faces into which the model W is divided; the maximum pooling operation eliminates the unordered arrangement of the faces in the model W. The global feature value can be directly input into a classification neural network to perform classification tasks, or used for feature distance calculation in retrieval tasks, so that it can be applied to more complex three-dimensional object tasks.
The technical scheme of the application is described in detail above with reference to the accompanying drawings, and the application provides a three-dimensional grid data processing method and system based on a neural network, wherein the system comprises: the data acquisition unit is used for acquiring data to be processed in the database, wherein the data type of the data to be processed is three-dimensional grid type, and the data to be processed comprises center point data, vertex vector data, unit normal vector data and adjacent surface data; the feature calculation unit is used for calculating fusion features corresponding to the data to be processed according to a neural network model, wherein the neural network model comprises a space description sub-model and a structure description sub-model; the global feature calculation unit is used for calculating a global feature value corresponding to the fusion feature according to a global multi-layer perception model and a maximum pooling model in the neural network model, and the global feature value is used for describing data to be processed. Through the technical scheme in the application, the feature extraction capability of the three-dimensional grid data of the three-dimensional model is improved, and the accuracy of processing the three-dimensional grid data is improved.
The steps in the present application may be sequentially adjusted, combined, and pruned according to actual requirements.
The units in the system can be combined, divided and pruned according to actual requirements.
Although the present application is disclosed in detail with reference to the accompanying drawings, it is to be understood that such descriptions are merely illustrative and are not intended to limit the scope of the present application. The scope of the present application is defined by the appended claims and may include various modifications, alterations and equivalents of the invention without departing from its scope and spirit.

Claims (6)

1. A neural network-based stereoscopic mesh data processing system, the processing system comprising: a data acquisition unit (10), a feature calculation unit (20), and a global feature calculation unit (30);
the data acquisition unit (10) is used for acquiring data to be processed in a database, wherein the data type of the data to be processed is a three-dimensional grid type, and the data to be processed comprises center point data, vertex vector data, unit normal vector data and adjacent surface data;
the feature calculation unit (20) is used for calculating fusion features corresponding to the data to be processed according to a neural network model, wherein the neural network model comprises a space description sub-model and a structure description sub-model;
wherein, the characteristic calculating unit (20) specifically further comprises: a spatial feature calculation unit (21) and a structural feature calculation unit (22);
the spatial feature calculation unit (21) is used for calculating initial spatial features according to the spatial description sub-model and the central point data; the space description sub-model comprises a space multi-layer perception model formed by 3 full-connection layers, wherein the dimensions of the full-connection layers are 3, 64 and 64 respectively, the central point data is used as the input of the space multi-layer perception model, and the initial space characteristics are used as the output of the space multi-layer perception model;
the structural feature calculation unit (22) is configured to calculate an initial structural feature according to the structural description sub-model, the vertex vector, the unit normal vector and the adjacent surface data, wherein the structural description sub-model includes a surface rotation convolution model and a surface kernel correlation model;
wherein the structural feature calculation unit (22) is configured to calculate a face rotation convolution value of the vertex vector data in the data to be processed according to the face rotation convolution model, wherein the formula of the face rotation convolution model is:

$$F_r = g\big(f(v_1, v_2),\ f(v_2, v_3),\ f(v_3, v_1)\big)$$

where $F_r$ is the face rotation convolution value, $v_1, v_2, v_3$ are the vertex vector data, $f(\cdot)$ is a convolution kernel function, and $g(\cdot)$ is a multi-layer perception function;
calculating a surface kernel characteristic value according to the surface kernel correlation model, the unit normal vector data and the adjacent surface data, wherein a plurality of operation kernels are arranged in the surface kernel correlation model, and each operation kernel consists of a group of parameters representing 4 unit normal vector distributions;
calculating the initial structural features according to an initial multi-layer perception model, the surface rotation convolution value, the surface kernel feature value and the unit normal vector data; the initial multi-layer perception model comprises 3 full-connection layers, wherein the dimensions of the full-connection layers are 131;
the feature calculation unit (20) is further configured to: according to the initial spatial feature, the initial structural feature and the adjacent surface data, a grid aggregation model and a fusion multi-layer perception model are adopted to calculate the fusion feature, wherein the fusion multi-layer perception model comprises a first fusion multi-layer perception model, a second fusion multi-layer perception model and a third fusion multi-layer perception model, and the fusion feature comprises a first fusion feature, a second fusion feature and a third fusion feature; the grid aggregation model comprises a first grid aggregation model and a second grid aggregation model, wherein the first grid aggregation model is used for aggregating initial structural features and adjacent surface data, and the aggregation result is recorded as a first structural feature; the second grid aggregation model is used for aggregating the first structural features and the adjacent surface data, and recording an aggregation result as a second structural feature, wherein the grid aggregation model is one of an average pooling model, a maximum pooling model and a connection fusion model;
the global feature calculation unit (30) is configured to directly connect the first fusion feature, the second fusion feature and the third fusion feature, and then input the first fusion feature, the second fusion feature and the third fusion feature into a global multi-layer perception model and a maximum pooling model in the neural network model, so as to obtain a global feature value corresponding to the fusion feature, where the global feature value is used to describe the data to be processed, and the global multi-layer perception model includes two fully connected layers, and dimensions of the fully connected layers are 1792 and 1024 in sequence.
2. The neural network-based stereoscopic mesh data processing system according to claim 1, wherein the feature calculation unit (20) is configured to:
calculating intermediate structural features according to the initial structural features, the adjacent surface data and the grid aggregation model, wherein the intermediate structural features comprise a first structural feature and a second structural feature;
calculating the first fusion feature by using the first fusion multi-layer perception model according to the initial spatial feature and the initial structural feature;
calculating the second fusion feature by using the second fusion multi-layer perception model according to the first fusion feature and the first structural feature;
and calculating the third fusion feature by using the third fusion multi-layer perception model according to the second fusion feature and the second structural feature.
3. The neural network-based stereoscopic mesh data processing system according to claim 2, wherein the global feature calculation unit (30) specifically comprises: a fusion unit and a pooling unit;
the fusion unit is used for calculating a feature fusion value according to the global multi-layer perception model and the fusion feature;
and the pooling unit is used for carrying out pooling operation on the feature fusion value according to the maximum pooling model, and recording the pooling operation result as the global feature value.
4. A neural network-based three-dimensional grid data processing method, characterized in that the processing method comprises the following steps:
step 1, obtaining data to be processed in a database, wherein the data type of the data to be processed is three-dimensional grid type, and the data to be processed comprises center point data, vertex vector data, unit normal vector data and adjacent surface data;
step 2, calculating fusion characteristics corresponding to the data to be processed according to a neural network model, wherein the neural network model comprises a space description sub-model and a structure description sub-model;
step 21, calculating initial spatial features according to the spatial description sub-model and the central point data; the space description sub-model comprises a space multi-layer perception model formed by 3 full-connection layers, wherein the dimensions of the full-connection layers are 3, 64 and 64 respectively, the central point data is used as the input of the space multi-layer perception model, and the initial space characteristics are used as the output of the space multi-layer perception model;
step 22, calculating initial structural features according to the structural description sub-model, the vertex vector, the unit normal vector and the adjacent surface data, wherein the structural description sub-model comprises a surface rotation convolution model and a surface kernel correlation model;
according to the surface rotation convolution model, calculating a surface rotation convolution value of the vertex vector data in the data to be processed, wherein the formula of the surface rotation convolution model is:

$$F_r = g\big(f(v_1, v_2),\ f(v_2, v_3),\ f(v_3, v_1)\big)$$

where $F_r$ is the face rotation convolution value, $v_1, v_2, v_3$ are the vertex vector data, $f(\cdot)$ is a convolution kernel function, and $g(\cdot)$ is a multi-layer perception function;
calculating a surface kernel characteristic value according to the surface kernel correlation model, the unit normal vector data and the adjacent surface data, wherein a plurality of operation kernels are arranged in the surface kernel correlation model, and each operation kernel consists of a group of parameters representing 4 unit normal vector distributions;
calculating the initial structural features according to an initial multi-layer perception model, the surface rotation convolution value, the surface kernel feature value and the unit normal vector data; the initial multi-layer perception model comprises 3 full-connection layers, wherein the dimensions of the full-connection layers are 131;
step 23, according to the initial spatial feature, the initial structural feature and the adjacent surface data, calculating the fusion feature by adopting a grid aggregation model and a fusion multi-layer perception model, wherein the fusion multi-layer perception model comprises a first fusion multi-layer perception model, a second fusion multi-layer perception model and a third fusion multi-layer perception model, and the fusion feature comprises a first fusion feature, a second fusion feature and a third fusion feature; the grid aggregation model comprises a first grid aggregation model and a second grid aggregation model, wherein the first grid aggregation model is used for aggregating initial structural features and adjacent surface data, and the aggregation result is recorded as a first structural feature; the second grid aggregation model is used for aggregating the first structural features and the adjacent surface data, and recording an aggregation result as a second structural feature, wherein the grid aggregation model is one of an average pooling model, a maximum pooling model and a connection fusion model;
and 3, directly connecting the first fusion feature, the second fusion feature and the third fusion feature, and then inputting the first fusion feature, the second fusion feature and the third fusion feature into a global multi-layer perception model and a maximum pooling model in the neural network model to obtain a global feature value corresponding to the fusion feature, wherein the global feature value is used for describing the data to be processed, the global multi-layer perception model comprises two layers of full-connection layers, and the dimensions of the full-connection layers are 1792 and 1024 in sequence.
5. The neural network-based stereoscopic mesh data processing method according to claim 4, wherein the step 23 specifically includes:
calculating intermediate structural features according to the initial structural features, the adjacent surface data and the grid aggregation model, wherein the intermediate structural features comprise a first structural feature and a second structural feature;
calculating the first fusion feature by using the first fusion multi-layer perception model according to the initial spatial feature and the initial structural feature;
calculating the second fusion feature by using the second fusion multi-layer perception model according to the first fusion feature and the first structural feature;
and calculating the third fusion feature by using the third fusion multi-layer perception model according to the second fusion feature and the second structural feature.
6. The neural network-based stereoscopic mesh data processing method of claim 5, wherein the step 3 specifically comprises:
step 31, calculating a feature fusion value according to the global multi-layer perception model and the fusion feature;
and step 32, carrying out pooling operation on the feature fusion value according to the maximum pooling model, and recording the pooling operation result as the global feature value.
CN201811540285.2A 2018-12-17 2018-12-17 Three-dimensional grid data processing method and system based on neural network Active CN109658489B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811540285.2A CN109658489B (en) 2018-12-17 2018-12-17 Three-dimensional grid data processing method and system based on neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811540285.2A CN109658489B (en) 2018-12-17 2018-12-17 Three-dimensional grid data processing method and system based on neural network

Publications (2)

Publication Number Publication Date
CN109658489A CN109658489A (en) 2019-04-19
CN109658489B true CN109658489B (en) 2023-06-30

Family

ID=66113678

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811540285.2A Active CN109658489B (en) 2018-12-17 2018-12-17 Three-dimensional grid data processing method and system based on neural network

Country Status (1)

Country Link
CN (1) CN109658489B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110570503B (en) * 2019-09-03 2021-04-16 浙江大学 Method for acquiring normal vector, geometry and material of three-dimensional object based on neural network

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103700088A (en) * 2013-12-01 2014-04-02 北京航空航天大学 Image set unsupervised co-segmentation method based on deformable graph structure representation
CN106203516A (en) * 2016-07-13 2016-12-07 中南大学 A kind of subspace clustering visual analysis method based on dimension dependency
WO2017166586A1 (en) * 2016-03-30 2017-10-05 乐视控股(北京)有限公司 Image identification method and system based on convolutional neural network, and electronic device
CN108133188A (en) * 2017-12-22 2018-06-08 武汉理工大学 A kind of Activity recognition method based on motion history image and convolutional neural networks


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Three-dimensional volume mesh generation method for the boundary face method; Huang Cheng; China Doctoral Dissertations Full-text Database, Information Science and Technology; 2014-12-15 (No. 12); full text *

Also Published As

Publication number Publication date
CN109658489A (en) 2019-04-19

Similar Documents

Publication Publication Date Title
Sfikas et al. Exploiting the PANORAMA Representation for Convolutional Neural Network Classification and Retrieval.
US20210158023A1 (en) System and Method for Generating Image Landmarks
Ma et al. Binary volumetric convolutional neural networks for 3-D object recognition
Papadakis et al. PANORAMA: A 3D shape descriptor based on panoramic views for unsupervised 3D object retrieval
CN108073682A (en) Based on parameter view functional query database
CN111680678B (en) Target area identification method, device, equipment and readable storage medium
CN112785526B (en) Three-dimensional point cloud restoration method for graphic processing
CN112529068B (en) Multi-view image classification method, system, computer equipment and storage medium
Zhou et al. 2D compressive sensing and multi-feature fusion for effective 3D shape retrieval
CN102708589B (en) Three-dimensional target multi-viewpoint view modeling method on basis of feature clustering
CN109658489B (en) Three-dimensional grid data processing method and system based on neural network
CN111597367A (en) Three-dimensional model retrieval method based on view and Hash algorithm
CN110910463B (en) Full-view-point cloud data fixed-length ordered encoding method and equipment and storage medium
Chen et al. 3D object classification with point convolution network
KR102129060B1 (en) Content-based 3d model retrieval method using a single depth image, 3d model retrieval server for performing the methods and computer readable recording medium thereof
US20220156416A1 (en) Techniques for comparing geometric styles of 3d cad objects
CN115830375A (en) Point cloud classification method and device
CN114511745A (en) Three-dimensional point cloud classification and rotation attitude prediction method and system
CN111009004B (en) Hardware optimization method for accelerating image matching
Schmitt et al. A 3D shape descriptor based on depth complexity and thickness histograms
CN113723208A (en) Three-dimensional object shape classification method based on normative equal transformation conversion sub-neural network
Nie et al. PANORAMA-based multi-scale and multi-channel CNN for 3D model retrieval
CN111414802A (en) Protein data feature extraction method
Guo et al. SWPT: Spherical Window-Based Point Cloud Transformer
Han et al. Feature based sampling: a fast and robust sampling method for tasks using 3D point cloud

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant