WO2022121545A1 - Graph convolutional network-based grid segmentation method - Google Patents


Info

Publication number
WO2022121545A1
Authority
WO
WIPO (PCT)
Prior art keywords
graph
layer
feature
features
convolution
Prior art date
Application number
PCT/CN2021/126910
Other languages
French (fr)
Chinese (zh)
Inventor
倪天宇
郑友怡
Original Assignee
浙江大学
Priority date
2020-12-10
Filing date
2021-10-28
Publication date
2022-06-16
Application filed by 浙江大学
Publication of WO2022121545A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Abstract

The present invention provides a graph convolutional network-based grid segmentation method. The method takes the faces of a mesh as basic units and performs graph convolution on the dual graph formed from the adjacency of the faces, so as to obtain a feature representation for each face. In the feature extraction stage, static and dynamic edge convolutions are used together, so that the network exploits the actual adjacency while also learning from latent relationships between faces. In addition, the features are further enhanced with the idea of feature embedding from instance segmentation, and the enhanced features are used to segment the mesh into its parts. The invention obtains good results on multiple part segmentation datasets.

Description

A Grid Segmentation Method Based on Graph Convolutional Networks
Technical Field
The invention belongs to the fields of computer graphics and computer vision, and in particular relates to a mesh part segmentation method based on a graph convolutional network.
Background
Semantic segmentation is one of the key problems in computer vision. With the development of deep learning, semantic segmentation of two-dimensional images with neural networks has been widely explored. When the problem is extended to three-dimensional meshes, image-based operations often cannot be applied directly because meshes are irregular. Previous methods typically either voxelize the 3D model or represent the 3D object with multi-view 2D images and then apply 2D image methods. The former increases the amount of computation because the data are sparse; the latter discards the original structure of the 3D object and is still computationally expensive. For 3D mesh data, we take faces as nodes to transform the mesh into its dual space, and learn features on the resulting graph with a graph convolutional neural network.
Early graph convolutional neural networks usually required a static graph structure, whereas recent research on dynamic graph convolution shows that dynamic edges can achieve better results. Our method uses both static and dynamic edge convolution for feature learning, exploiting the original geometric structure while also considering latent similarity relationships.
In instance segmentation, feature embedding is a commonly used technique. Its main idea is to obtain a representation in which elements of the same class are close together and elements of different classes are far apart, and then to use this representation to obtain the final instance segmentation. Our method draws on this idea and uses the representation obtained from feature embedding for the final part segmentation.
Summary of the Invention
The invention proposes a mesh segmentation method based on a graph convolutional network (GCN). The mesh is converted into a graph representation according to the adjacency of its faces, so that effective feature learning is achieved through graph convolution and feature embedding. The graph convolution used in the invention combines static edge convolution with dynamic edge convolution, and thus considers both the original geometric structure and relationships in the feature space. The invention also uses feature embedding to constrain the distribution of features in the feature space.
The invention is achieved through the following technical solution:
A grid segmentation method based on a graph convolutional network, comprising the following steps:
Step 1: Transform the mesh model to a specified number of faces and normalize it.
Step 2: Convert the model processed in Step 1 into a graph representation, extract preliminary features for each face, and input them into a trained graph convolutional neural network to predict the part class to which each face of the mesh belongs. The graph convolutional neural network comprises:
A transformation module, used to bring the input preliminary features to similar orientations.
A graph convolution module, used to learn, from the transformed preliminary features, features related to the neighboring faces in real space and the neighboring faces in feature space.
A feature embedding module, used to obtain, from the features produced by the graph convolution module, features that are close for the same class and far apart for different classes.
An output module, used to obtain the predicted segmentation from the features learned by the graph convolution layers and the result of feature embedding.
Further, Step 1 is implemented through the following sub-steps:
(1.1) Simplify or subdivide the input model to the specified number of faces.
(1.2) Translate and scale the transformed model so that the mean of all its vertices is 0 and the maximum distance from the origin is 1.
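Sub-step (1.2) can be illustrated with a minimal NumPy sketch. The function name `normalize_vertices` and the `(N, 3)` array layout are assumptions made for illustration and do not come from the patent; sub-step (1.1), mesh simplification or subdivision, is not shown.

```python
import numpy as np

def normalize_vertices(vertices: np.ndarray) -> np.ndarray:
    """Center an (N, 3) vertex array at the origin and scale it so that
    the farthest vertex lies at distance 1 from the origin."""
    # Translate so that the mean of all vertices is 0.
    centered = vertices - vertices.mean(axis=0)
    # Scale so that the maximum distance from the origin is 1.
    max_dist = np.linalg.norm(centered, axis=1).max()
    if max_dist == 0:
        # Degenerate input (all vertices identical): nothing to scale.
        return centered
    return centered / max_dist
```

Because only a translation and a uniform scale are applied, the face connectivity of the mesh is unchanged.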
Further, the transformation module consists of one static convolution layer, one max pooling layer and several fully connected layers, which together predict a rotation matrix; the input preliminary features are then transformed by this rotation matrix.
Further, the graph convolution module comprises static convolution layers, dynamic convolution layers, a fully connected layer and a pooling layer. The features learned by each static and dynamic convolution layer are concatenated and fed to the fully connected layer, which summarizes them, and the overall feature is obtained through the pooling layer.
Further, the feature embedding module consists of fully connected layers.
Further, during training the feature embedding module is constrained by three loss functions: L_var pulls features of the same class together, L_dist pushes features of different classes apart, and L_reg constrains the range of the feature embedding.
Further, in the invention both the static and the dynamic convolution layers adopt the Edge-Conditioned Convolution structure.
The beneficial effects of the invention are as follows:
The invention proposes a mesh segmentation method based on a graph convolutional neural network. Unlike previous learning-based mesh segmentation methods, which learn features from multi-view images or voxel-based representations, the invention uses the structure of the triangle mesh itself, introduces graph convolution on a face-based graph representation, and obtains a further representation through feature embedding. The invention represents the mesh through its natural structure and is lightweight in both the training and inference stages. In graph convolution, the invention uses static and dynamic convolution together, learning from both the original mesh structure and similarity in the feature space. The invention achieves good results on several mesh part segmentation datasets.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of the mesh segmentation process of the invention.
FIG. 2 shows mesh segmentation results of the invention, in which adjacent segments of different classes are distinguished by black and white.
Detailed Description
The idea of the invention is to form a graph from the adjacency of the faces in the mesh, learn features on this graph with a graph convolutional neural network and feature embedding, use fully connected layers to obtain per-class scores for each face, and finally predict the class to which each face belongs. It specifically comprises the following steps:
Step 1: Transform the mesh model to a specified number of faces, and center and scale it.
Step 2: Convert the model processed in Step 1 into a graph representation, extract preliminary features for each face, and input them into the corresponding trained graph convolutional neural network to predict the part class of each face in the mesh. The graph convolutional neural network consists of a transformation module, a graph convolution module, a feature embedding module and fully connected layers.
Step 1 is a preprocessing step. The structure of the graph convolutional neural network in Step 2 is shown in FIG. 1.
Consider an input mesh model M = {V, F}, where V denotes all vertices and F denotes all faces. After feature extraction, an undirected graph G = {Q, E, Φ} is built: for each face f_i ∈ F a node q_i ∈ Q is created, and for each pair of adjacent faces f_i, f_j an undirected edge (q_i, q_j) ∈ E is created. Φ holds the features of the nodes; for f_i, the corresponding φ_i = {c_i, n_i, v_i, a_i} denotes the centroid coordinates, normal, vertex coordinates and area of face f_i, respectively.
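The construction of the face graph G = {Q, E, Φ} can be sketched as follows for a triangle mesh. This is an illustrative sketch under stated assumptions, not the patent's implementation: two faces are treated as adjacent when they share an edge, the three vertex coordinates are flattened into nine values so that each φ_i has 16 entries, and the names `face_features` and `face_adjacency` are hypothetical.

```python
import numpy as np
from collections import defaultdict

def face_features(vertices: np.ndarray, faces: np.ndarray) -> np.ndarray:
    """Per-face features [centroid(3), normal(3), vertex coords(9), area(1)]."""
    tri = vertices[faces]                               # (F, 3, 3) corner positions
    centroids = tri.mean(axis=1)                        # (F, 3)
    cross = np.cross(tri[:, 1] - tri[:, 0], tri[:, 2] - tri[:, 0])
    double_area = np.linalg.norm(cross, axis=1)         # |cross| = 2 * area
    normals = cross / np.maximum(double_area, 1e-12)[:, None]
    areas = 0.5 * double_area
    corners = tri.reshape(len(faces), 9)
    return np.hstack([centroids, normals, corners, areas[:, None]])

def face_adjacency(faces: np.ndarray) -> list:
    """Undirected edges (i, j), i < j, between faces that share a mesh edge."""
    edge_to_faces = defaultdict(list)
    for fi, face in enumerate(faces):
        a, b, c = (int(x) for x in face)
        for u, v in ((a, b), (b, c), (c, a)):
            edge_to_faces[(min(u, v), max(u, v))].append(fi)
    edges = set()
    for shared in edge_to_faces.values():
        for i in range(len(shared)):
            for j in range(i + 1, len(shared)):
                edges.add((min(shared[i], shared[j]), max(shared[i], shared[j])))
    return sorted(edges)
```

Each row of the feature matrix corresponds to one node q_i, and the edge list plays the role of E for the static edge convolution layers.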
The graph convolutional network used in the invention has multiple convolution layers and adopts the basic structure of (Wang, Yue, et al. "Dynamic graph cnn for learning on point clouds." ACM Transactions on Graphics (TOG) 38.5 (2019): 1-12.). For the graph G^l = {Q^l, E^l, Φ^l} of the l-th layer, the node features are updated as follows:
[Equation image PCTCN2021126910-appb-000001: node feature update rule, not reproduced in this text]
where [symbol image PCTCN2021126910-appb-000002] is a nonlinear function with learnable parameters θ. This update considers both the global feature φ_i and the local feature [symbol image PCTCN2021126910-appb-000003], which reflects the relationship between adjacent faces. In static edge convolution, the edge set E^l is the initial face adjacency. In dynamic edge convolution, Euclidean distance in the feature space is used as the metric, and the k nearest faces are treated as the neighboring faces.
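Since the update equation itself survives only as an image reference, the following sketch follows the EdgeConv formulation published in the cited Wang et al. paper rather than the patent's exact formula: each node takes the maximum, over its neighbors j, of a nonlinear function of (φ_i, φ_j − φ_i). The single linear layer with ReLU, the max aggregation, and the names `knn_neighbors` and `edge_conv` are assumptions made for illustration.

```python
import numpy as np

def knn_neighbors(features: np.ndarray, k: int) -> np.ndarray:
    """For each row, indices of the k nearest other rows (Euclidean distance)."""
    diff = features[:, None, :] - features[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    np.fill_diagonal(dist, np.inf)           # a face is not its own neighbor
    return np.argsort(dist, axis=1)[:, :k]

def edge_conv(features, neighbors, weight, bias):
    """One edge convolution: max over neighbors of ReLU(W^T [phi_i, phi_j - phi_i] + b).

    features: (N, D); neighbors: per-node sequence of neighbor indices;
    weight: (2 * D, D_out); bias: (D_out,).
    """
    out = np.zeros((features.shape[0], weight.shape[1]))
    for i, nbrs in enumerate(neighbors):
        if len(nbrs) == 0:
            continue                          # isolated node keeps a zero feature
        center = np.repeat(features[i][None, :], len(nbrs), axis=0)
        local = features[nbrs] - features[i]  # relation to each neighbor
        edge_in = np.hstack([center, local])
        messages = np.maximum(edge_in @ weight + bias, 0.0)
        out[i] = messages.max(axis=0)
    return out
```

A static layer would pass the mesh face adjacency as `neighbors`, while a dynamic layer would recompute `knn_neighbors(features, k)` from the current features before each convolution.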
The transformation module combines one static convolution, one max pooling and several fully connected layers. It predicts a rotation matrix for each input feature map and transforms the initial input features by this matrix, so that the features processed afterwards correspond to orientations that are as similar as possible.
The graph convolution block consists of three static edge convolution layers and three dynamic convolution layers. Each dynamic convolution layer takes the 10 nearest faces in the feature space as neighbors. The outputs of all layers are concatenated and fed into the pooling layer to obtain an overall feature representation.
Building on the features learned by graph convolution, the feature embedding module uses fully connected layers to predict, for each face, its representation s_i in the feature space and a value σ_i related to the extent that its class occupies in the feature space. During training, the loss function of the feature embedding module is given by:
[Equation images PCTCN2021126910-appb-000004, PCTCN2021126910-appb-000005 and PCTCN2021126910-appb-000006: not reproduced in this text]
L = α*L_var + β*L_dist + γ*L_reg
This loss function was proposed in (De Brabandere, Bert, Davy Neven, and Luc Van Gool. "Semantic Instance Segmentation with a Discriminative Loss Function." arXiv (2017): arXiv-1708.). Here C is the number of classes, N_c is the number of faces in class c, R_c is the set of faces in class c, and u_c is the mean of s_i over class c. δ_v and δ_d are thresholds, set to 0.01 and 3 respectively. α, β and γ are the weights of the three terms, set to 1, 1 and 0.001 in actual training. c_A and c_B denote different classes. In this loss, L_var pulls each embedding toward the mean of its class, L_dist pushes the embeddings of different classes apart, and L_reg constrains the range of the embeddings.
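As a rough illustration, the sketch below implements the discriminative loss in the form published by De Brabandere et al., using the thresholds and weights stated above as defaults. The patent's own term definitions are only available as image references and may differ in detail, in particular the factor 2 in front of δ_d, which follows the paper. The function name `discriminative_loss` is hypothetical.

```python
import numpy as np

def discriminative_loss(embeddings, labels, delta_v=0.01, delta_d=3.0,
                        alpha=1.0, beta=1.0, gamma=0.001):
    """Discriminative loss over per-face embeddings s_i of shape (N, D)."""
    labels = np.asarray(labels)
    classes = np.unique(labels)
    # u_c: mean embedding of each class.
    means = np.stack([embeddings[labels == c].mean(axis=0) for c in classes])

    # L_var: pull each embedding to within delta_v of its class mean.
    l_var = 0.0
    for mu, c in zip(means, classes):
        d = np.linalg.norm(embeddings[labels == c] - mu, axis=1)
        l_var += np.mean(np.maximum(d - delta_v, 0.0) ** 2)
    l_var /= len(classes)

    # L_dist: push the means of different classes at least 2 * delta_d apart.
    l_dist = 0.0
    n = len(classes)
    if n > 1:
        for a in range(n):
            for b in range(n):
                if a != b:
                    gap = np.linalg.norm(means[a] - means[b])
                    l_dist += np.maximum(2.0 * delta_d - gap, 0.0) ** 2
        l_dist /= n * (n - 1)

    # L_reg: keep the class means close to the origin.
    l_reg = np.mean(np.linalg.norm(means, axis=1))
    return float(alpha * l_var + beta * l_dist + gamma * l_reg)
```

With well-separated, tight clusters only the small regularization term remains, which is why γ is set much lower than α and β.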
During training, the probability that face i is assigned to class c is given by
[Equation image PCTCN2021126910-appb-000007: not reproduced in this text]
where [symbol image PCTCN2021126910-appb-000008] is the mean range of that class. A cross-entropy loss term is then computed from this probability and the ground-truth class.
After the feature embedding result is obtained, the final output layer takes the features learned by the preceding graph convolution layers and the feature embedding result as input, passes them through several fully connected layers to obtain the final prediction, and computes its cross-entropy loss. Every layer except the last uses LeakyReLU as the activation function together with batch normalization. The main role of the fully connected layers is to weight the previously obtained features and remap them into the class space: their output is a tensor of size (number of faces) × (number of classes), and applying softmax to this output gives the predicted probability of each class.
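The last step of the output layer can be sketched in NumPy as follows: a score tensor of shape (number of faces) × (number of classes) is turned into per-face class probabilities by softmax and compared with the ground-truth labels through cross-entropy. The function names and the small constant added inside the logarithm are illustrative assumptions; the fully connected layers that produce the scores are not shown.

```python
import numpy as np

def softmax(scores: np.ndarray) -> np.ndarray:
    """Row-wise softmax over an (F, C) score tensor."""
    shifted = scores - scores.max(axis=1, keepdims=True)   # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(probs: np.ndarray, labels: np.ndarray) -> float:
    """Mean negative log-probability of the true class of each face."""
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + 1e-12)))

def predict_parts(scores: np.ndarray) -> np.ndarray:
    """Predicted part class per face: the class with the highest probability."""
    return softmax(scores).argmax(axis=1)
```

At inference time only `predict_parts` is needed; the cross-entropy term is used during training.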
The dataset used for training can be obtained as follows: simplify the labeled models to similar face counts (each model's face count being as close as possible to the specified count), which yields a dataset usable for training.
Some segmentation results are shown in FIG. 2. As the figure shows, the invention segments models of many categories well.
Obviously, the above embodiments are merely examples given for clarity of description and do not limit the implementations. Those of ordinary skill in the art can make other changes or modifications in different forms on the basis of the above description. It is neither necessary nor possible to enumerate all implementations here. Obvious changes or modifications derived from the above remain within the protection scope of the invention.

Claims (6)

  1. 一种基于图卷积网络的网格分割方法,其特征在于,包括以下步骤:A grid segmentation method based on a graph convolutional network, characterized in that it comprises the following steps:
    步骤一:将网格模型变换到指定的面片数量,并进行标准化处理;Step 1: Transform the mesh model to the specified number of patches and standardize it;
    步骤二:将步骤一处理后的模型转换为图表示,并对每个面进行初步特征提取后输入到训练好的图卷积神经网络中,对网格中每个面属于的部位种类进行预测;其中,所述图卷积神经网络包括:Step 2: Convert the model processed in Step 1 into a graph representation, and perform preliminary feature extraction on each face and input it into the trained graph convolutional neural network to predict the type of part that each face in the grid belongs to. ; wherein, the graph convolutional neural network includes:
    变换模块,用于使输入的初步特征的朝向相近;The transformation module is used to make the orientation of the input preliminary features similar;
    图卷积模块,用于根据变换后的初步特征学习与实际空间中邻面以及特征空间中邻面有关的特征;The graph convolution module is used to learn the features related to the adjacent surfaces in the actual space and the adjacent surfaces in the feature space according to the transformed preliminary features;
    特征嵌入模块,用于根据图卷积模块获得的特征获取同类相近不同类距离较远的特征;The feature embedding module is used to obtain features with similar similar and different categories with far distance according to the features obtained by the graph convolution module;
    输出模块,用于根据图卷积层学到的特征以及特征嵌入的结果获得预测分割结果。The output module is used to obtain predicted segmentation results based on the features learned by the graph convolutional layer and the results of feature embedding.
  2. 根据权利要求1所述的一种基于图卷积网络的网格分割方法,其特征在于,所述步骤一通过以下子步骤来实现:A grid segmentation method based on a graph convolutional network according to claim 1, wherein the step 1 is realized by the following sub-steps:
    (1.1)对于输入模型,将其简化或细分到指定面片数量;(1.1) For the input model, simplify or subdivide it to the specified number of patches;
    (1.2)对于变换后的模型,对其进行平移和缩放操作,使模型中所有顶点的均值为0,离原点的最大距离为1。(1.2) For the transformed model, perform translation and scaling operations on it, so that the mean value of all vertices in the model is 0, and the maximum distance from the origin is 1.
  3. 根据权利要求1所述的一种基于图卷积网络的网格分割方法,其特征在于,所述变换模块由一层静态卷积层、一层最大池化层与若干全连接层组成,所述一层静态卷积层、一层最大池化层与若干全连接层用于预测一个旋转矩阵,并通过旋转矩阵对于输入的初步特征进行变换。The grid segmentation method based on a graph convolutional network according to claim 1, wherein the transformation module is composed of a static convolutional layer, a maximum pooling layer and a number of fully connected layers. A static convolutional layer, a maximum pooling layer and several fully connected layers are used to predict a rotation matrix and transform the input preliminary features through the rotation matrix.
  4. 根据权利要求1所述的一种基于图卷积网络的网格分割方法,其特征在于,所述图卷积模块包括静态卷积层、动态卷积层、全连接层和池化层,其中,静态卷积层、动态卷积层各层学到的特征连接并输入到全连接层进行总结并通过池化层得到总体特征。The method for grid segmentation based on a graph convolutional network according to claim 1, wherein the graph convolution module comprises a static convolution layer, a dynamic convolution layer, a fully connected layer and a pooling layer, wherein , the features learned by the static convolution layer and the dynamic convolution layer are connected and input to the fully connected layer for summarization and the overall feature is obtained through the pooling layer.
  5. The grid segmentation method based on a graph convolutional network according to claim 1, wherein the feature embedding module is composed of a fully connected layer.
  6. The grid segmentation method based on a graph convolutional network according to claim 1, wherein, during training, the feature embedding module is constrained by three loss functions: L_var constrains features of the same class to be close to one another, L_dist constrains features of different classes to be far apart, and L_reg constrains the range of the feature embedding.
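The claim names the three loss terms but does not give their formulas. The sketch below shows one possible form, borrowed from hinge-based discriminative embedding losses: L_var pulls each feature toward its class mean, L_dist pushes class means apart, and L_reg penalizes the norm of the class means. The hinge form, the margins `delta_var` and `delta_dist`, and the function name are assumptions, not the application's definitions.

```python
import math
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def embedding_losses(
    embeddings: Sequence[Sequence[float]],
    labels: Sequence[int],
    delta_var: float = 0.5,   # assumed pull margin
    delta_dist: float = 1.5,  # assumed push margin
) -> Tuple[float, float, float]:
    """Return (L_var, L_dist, L_reg) for a set of labeled embeddings."""
    if not embeddings:
        return 0.0, 0.0, 0.0
    dim = len(embeddings[0])

    # Group embeddings by class label and compute each class mean.
    groups: Dict[int, List[Sequence[float]]] = defaultdict(list)
    for e, y in zip(embeddings, labels):
        groups[y].append(e)
    means = {
        y: [sum(e[d] for e in es) / len(es) for d in range(dim)]
        for y, es in groups.items()
    }

    # L_var: penalize members farther than delta_var from their class mean.
    l_var = sum(
        sum(max(0.0, math.dist(e, means[y]) - delta_var) ** 2 for e in es) / len(es)
        for y, es in groups.items()
    ) / len(groups)

    # L_dist: penalize pairs of class means closer than 2 * delta_dist.
    pairs = list(combinations(means.values(), 2))
    l_dist = (
        sum(max(0.0, 2.0 * delta_dist - math.dist(a, b)) ** 2 for a, b in pairs)
        / len(pairs)
        if pairs else 0.0
    )

    # L_reg: keep class means near the origin to bound the embedding range.
    origin = [0.0] * dim
    l_reg = sum(math.dist(m, origin) for m in means.values()) / len(means)
    return l_var, l_dist, l_reg
```

In training, the three terms would typically be combined as a weighted sum; the weights are left open here because the claim does not specify them.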
PCT/CN2021/126910 2020-12-10 2021-10-28 Graph convolutional network-based grid segmentation method WO2022121545A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011455359.XA CN112634281A (en) 2020-12-10 2020-12-10 Grid segmentation method based on graph convolution network
CN202011455359.X 2020-12-10

Publications (1)

Publication Number Publication Date
WO2022121545A1 (en) 2022-06-16

Family

ID=75310104

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/126910 WO2022121545A1 (en) 2020-12-10 2021-10-28 Graph convolutional network-based grid segmentation method

Country Status (2)

Country Link
CN (1) CN112634281A (en)
WO (1) WO2022121545A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112634281A (en) * 2020-12-10 2021-04-09 浙江大学 Grid segmentation method based on graph convolution network

Citations (4)

Publication number Priority date Publication date Assignee Title
CN109934826A (en) * 2019-02-28 2019-06-25 东南大学 A kind of characteristics of image dividing method based on figure convolutional network
CN109993748A (en) * 2019-03-30 2019-07-09 华南理工大学 A kind of three-dimensional grid method for segmenting objects based on points cloud processing network
CN110021069A (en) * 2019-04-15 2019-07-16 武汉大学 A kind of method for reconstructing three-dimensional model based on grid deformation
CN112634281A (en) * 2020-12-10 2021-04-09 浙江大学 Grid segmentation method based on graph convolution network

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
US11853903B2 (en) * 2017-09-28 2023-12-26 Siemens Aktiengesellschaft SGCNN: structural graph convolutional neural network
CN109255791A (en) * 2018-07-19 2019-01-22 杭州电子科技大学 A kind of shape collaboration dividing method based on figure convolutional neural networks
CN110838122B (en) * 2018-08-16 2022-06-14 北京大学 Point cloud segmentation method and device and computer storage medium
CN111461258B (en) * 2020-04-26 2023-04-18 武汉大学 Remote sensing image scene classification method of coupling convolution neural network and graph convolution network

Also Published As

Publication number Publication date
CN112634281A (en) 2021-04-09


Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 21902252; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 Ep: PCT application non-entry in European phase. Ref document number: 21902252; Country of ref document: EP; Kind code of ref document: A1