CN117934488A - Construction and optimization method of a semi-supervised three-dimensional shape segmentation framework, and electronic device
- Publication number: CN117934488A (application CN202311667459.2A)
- Authority: CN (China)
- Prior art keywords: dimensional shape, data set, segmentation network, module, segmentation
- Legal status: Pending
Classifications
- G06T7/10: Image analysis; segmentation; edge detection
- G06N3/0464: Convolutional networks [CNN, ConvNet]
- G06N3/08: Neural networks; learning methods
- G06V10/26: Segmentation of patterns in the image field; cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; detection of occlusion
- G06V10/774: Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
- G06V10/80: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
- G06V10/82: Image or video recognition or understanding using neural networks
- G06V20/64: Scenes; scene-specific elements; three-dimensional objects
- G06T2207/10004: Still image; photographic image
- G06T2207/10012: Stereo images
- G06T2207/20081: Training; learning
- G06T2207/20084: Artificial neural networks [ANN]
Abstract
Description
Technical Field

The present application relates to the field of three-dimensional (3D) shape segmentation, and in particular to a method for constructing and optimizing a semi-supervised 3D shape segmentation framework, and to an electronic device.

Background

The 3D shape segmentation task divides a 3D shape into meaningful parts and is crucial for processing 3D shapes efficiently. It makes the intrinsic properties of a shape, such as its topological structure, easier to understand. Many tasks, including mesh editing, reconstruction, modeling, deformation, and shape retrieval, therefore rely on 3D shape segmentation to obtain satisfactory results, and shape segmentation has become one of the most popular and challenging research areas.

Traditional 3D shape segmentation methods usually consist of three main steps: first, each face of the shape is mapped to a feature vector using hand-crafted shape descriptors; next, clustering or classification is applied in the feature space to assign a label to each feature vector; finally, each face of the 3D shape is labeled according to the label of its feature vector. Recent advances in machine learning, however, have produced learning-based segmentation methods, especially methods built on deep learning architectures, which show significant performance improvements over traditional geometric optimization methods.

Although learning-based segmentation methods, and deep learning methods in particular, have achieved impressive results, they have a major drawback: they require a large amount of fully labeled training data similar to the target shape, which imposes a significant time and cost burden for manual labeling. In addition, these methods usually require a different set of training shapes for each target 3D shape, which increases the complexity of the training process.
Summary of the Invention

To overcome the limitations of current learning-based segmentation methods, reduce the complexity of the training process, and improve segmentation accuracy, one aspect of the embodiments of the present application provides a method for constructing a semi-supervised 3D shape segmentation framework. The framework comprises an auxiliary segmentation network module, a self-refinement module, and a main segmentation network module containing a main segmentation network and a computation module; the auxiliary segmentation network module comprises an auxiliary segmentation network for predicting pseudo-labels, a projection module for projecting 3D shapes, and a back-projection module.

The construction method comprises:

acquiring a first data set and a second data set, where the data in the first data set are fully annotated 3D shapes and the data in the second data set are 3D shapes with sparse scribble annotations;

sampling the reference labels of each 3D shape in the first data set to generate corresponding 3D shapes with sparse scribble annotations, yielding a third data set;

inputting the third data set into the auxiliary segmentation network module to train the auxiliary segmentation network, where the projection module obtains multi-view projection images of each 3D shape in the third data set to form a fourth data set and builds a reference matrix; the reference matrix records, for each pixel produced while generating the multi-view projection images, the coordinates of the corresponding vertex on the original 3D shape; the auxiliary segmentation network is trained on the multi-view projection images of the fourth data set under a first objective function to predict a label for each 2D image; and the back-projection module uses the reference matrix to map the predicted labels back onto the faces of the corresponding 3D shapes, generating a full annotation for every face of the 3D shapes in the third data set;

inputting the second data set into the trained auxiliary segmentation network module to output fully annotated 3D shapes, i.e. pseudo-labels, yielding a fifth data set;

training the main segmentation network on the first and fifth data sets so that it outputs a face feature for every face of a 3D shape and predicts a corresponding label;

fusing, through the self-refinement module, the pseudo-labels output by the auxiliary segmentation network module with the face features output by the main segmentation network to obtain fused pseudo-labels; and

computing, through the computation module, a cross-entropy loss from the true labels, the fused pseudo-labels, and the labels predicted by the main segmentation network, and adjusting the network parameters of the main segmentation network according to this loss.
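The last two steps, fusing the auxiliary pseudo-label with the main network's prediction and scoring the result with a cross-entropy loss, can be sketched in plain Python as follows. This is an illustrative reconstruction, not the patented implementation: the scalar `weight` stands in for the self-refinement module's learned dynamic weighting, and all function names are assumptions.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def fuse_pseudo_label(aux_label_probs, face_logits, weight):
    """Blend the auxiliary network's pseudo-label distribution with the
    main network's face prediction using a (learned) scalar weight."""
    face_probs = softmax(face_logits)
    return [weight * a + (1.0 - weight) * p
            for a, p in zip(aux_label_probs, face_probs)]

def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    """Cross-entropy between a (soft) target distribution and a prediction."""
    return -sum(t * math.log(max(p, eps))
                for t, p in zip(target_probs, predicted_probs))

# One face with 3 candidate part labels.
aux_pseudo = [0.7, 0.2, 0.1]     # auxiliary network's pseudo-label distribution
main_logits = [2.0, 0.5, 0.1]    # main network's raw face prediction
fused = fuse_pseudo_label(aux_pseudo, main_logits, weight=0.6)
loss = cross_entropy(fused, softmax(main_logits))
```

In training, `loss` would be backpropagated to adjust the main segmentation network's parameters, as the step above describes.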
Further, the 2D images of the multi-view projection comprise depth maps and rendered images.

The back-projection module is specifically configured to: for each pixel, look up the vertex coordinates recorded at the corresponding position of the reference matrix and find the face of the 3D shape at those coordinates, thereby obtaining the correspondence between the predicted labels in the 2D images and the semantic labels of the faces of the 3D shape; and map the predicted labels onto the corresponding faces of the 3D shape based on this correspondence.

Further, the auxiliary segmentation network comprises:

an Encoder module, configured to receive the multi-view projection images and extract the image feature information of each 2D image; and

a Decoder module, configured to predict labels for each 2D image from its image feature information, yielding the predicted label of each 2D image.

Further, the main segmentation network comprises:

a main subdivision module composed of four fully connected layers, configured to extract a face feature for each face of the 3D shape; and

a Softmax module, configured to predict the corresponding label from the face feature of each face.
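A minimal sketch of this structure (four fully connected layers followed by a softmax over the part labels) in plain Python; the layer widths, random initialization, and ReLU activations between layers are assumptions, since the patent does not specify them:

```python
import math
import random

random.seed(0)

def linear(x, w, b):
    """Fully connected layer: y = W x + b."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(w, b)]

def relu(x):
    return [max(0.0, v) for v in x]

def softmax(x):
    m = max(x)
    e = [math.exp(v - m) for v in x]
    s = sum(e)
    return [v / s for v in e]

def init(n_out, n_in):
    """Small random weights and zero biases (illustrative only)."""
    return ([[random.uniform(-0.1, 0.1) for _ in range(n_in)]
             for _ in range(n_out)],
            [0.0] * n_out)

# Four fully connected layers mapping a face feature vector to
# per-label probabilities; the widths below are illustrative.
dims = [16, 32, 32, 16, 4]          # input dim ... number of part labels
layers = [init(o, i) for i, o in zip(dims, dims[1:])]

def predict(face_feature):
    h = face_feature
    for k, (w, b) in enumerate(layers):
        h = linear(h, w, b)
        if k < len(layers) - 1:     # ReLU between hidden layers
            h = relu(h)
    return softmax(h)               # the Softmax module

probs = predict([random.uniform(-1, 1) for _ in range(16)])
```

The second-to-last layer's output plays the role of the face feature passed to the self-refinement module, while `softmax` yields the predicted label distribution.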
The computation module is specifically configured to: for the first data set, compute the cross-entropy loss from the true labels and the labels predicted by the Softmax module; and for the fifth data set, compute the cross-entropy loss from the fused pseudo-labels and the labels predicted by the Softmax module.

Further, the self-refinement module comprises two convolutional layers, configured to dynamically learn, during prediction, the weights assigned to the predictions of the main and auxiliary segmentation networks for the 3D shapes with sparse scribble annotations.

Further, the first objective function is expressed as:

where F denotes the first data set; s(f) denotes a face of a 3D shape in the first data set; x(f) denotes the feature vector of face s(f); y(f) denotes the label predicted for x(f) by the auxiliary segmentation network; θ denotes the network parameters of the auxiliary segmentation network; and p_aux denotes the prediction function of the auxiliary segmentation network.
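A plausible form of this objective, inferred only from the variable definitions above and offered as a sketch rather than the patent's verbatim formula, is a negative log-likelihood (cross-entropy) over the fully labeled set:

```latex
\theta^{*} \;=\; \arg\min_{\theta}\; -\sum_{s^{(f)} \in F} \log p_{aux}\!\left(y^{(f)} \,\middle|\, x^{(f)};\, \theta\right)
```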
According to another aspect of the embodiments of the present application, a method for optimizing a semi-supervised 3D shape segmentation framework is also provided, comprising:

acquiring an objective function; and

optimizing the 3D shape segmentation framework through the objective function, where the 3D shape segmentation framework is obtained by the construction method described above.

Further, the objective function is expressed as:

where φ denotes the network parameters of the main segmentation network; λ denotes the network parameters of the self-refinement module; F denotes the first data set; p_pri denotes the prediction of the main segmentation network; x(f) denotes the feature vector of a face of a 3D shape in F; y(f) denotes the label predicted for x(f) by the main segmentation network; S denotes the second or third data set; q_conv denotes the dynamic weights produced by the self-refinement module; x(s) denotes the feature vector of a face of a 3D shape in S; s(s) denotes a face of a 3D shape in S; Y denotes the auxiliary segmentation network's prediction for x(s); and s(f) denotes a face of a 3D shape in the first data set.
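A plausible form of this overall objective, assembled only from the variable definitions above (a sketch under stated assumptions, not the patent's verbatim formula), combines a supervised cross-entropy term on the fully labeled set with a dynamically weighted pseudo-label term on the scribble-labeled set:

```latex
\phi^{*},\, \lambda^{*} \;=\; \arg\min_{\phi,\, \lambda}\;
-\sum_{s^{(f)} \in F} \log p_{pri}\!\left(y^{(f)} \,\middle|\, x^{(f)};\, \phi\right)
\;-\; \sum_{s^{(s)} \in S} q_{conv}\!\left(Y,\, x^{(s)};\, \lambda\right)^{\top} \log p_{pri}\!\left(x^{(s)};\, \phi\right)
```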
According to another aspect of the embodiments of the present application, a semi-supervised 3D shape segmentation method is also provided, comprising:

acquiring an unannotated 3D shape to be segmented; and

inputting the unannotated 3D shape into the main segmentation network for prediction to obtain a fully annotated 3D shape, where the main segmentation network is trained by the construction method for the semi-supervised 3D shape segmentation framework described above.
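The inference step amounts to taking, for each face, the highest-scoring label from the main segmentation network's Softmax output. The sketch below is illustrative; the score format and the function name `predict_face_labels` are assumptions:

```python
def predict_face_labels(face_scores):
    """Given per-face class scores (e.g. Softmax outputs) from the main
    segmentation network, assign each face the label with the highest
    score (argmax), yielding a fully labeled shape."""
    return [max(range(len(scores)), key=scores.__getitem__)
            for scores in face_scores]

# Scores for 3 faces over 4 part labels.
scores = [
    [0.1, 0.6, 0.2, 0.1],
    [0.5, 0.2, 0.2, 0.1],
    [0.05, 0.05, 0.1, 0.8],
]
labels = predict_face_labels(scores)   # one part label per face
```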
According to yet another aspect of the embodiments of the present application, an electronic device is also provided, comprising a processor and a memory storing a program, the program comprising instructions that, when executed by the processor, cause the processor to perform the method of any of the above embodiments.

Compared with the prior art, the technical solution proposed in the present application has the following technical effects:

(1) The present invention samples the reference labels of each 3D shape in the first data set to generate corresponding 3D shapes with sparse scribble annotations, yielding a third data set; inputs the third data set into the auxiliary segmentation network module to train the auxiliary segmentation network; inputs the second data set into the trained auxiliary segmentation network module to output fully annotated 3D shapes, i.e. pseudo-labels, yielding a fifth data set; trains the main segmentation network on the first and fifth data sets to output a face feature for each face of a 3D shape and predict a corresponding label; fuses the pseudo-labels output by the auxiliary segmentation network module with the face features output by the main segmentation network through the self-refinement module to obtain fused pseudo-labels; and computes a cross-entropy loss from the true labels, the fused pseudo-labels, and the labels predicted by the main segmentation network through the computation module, adjusting the network parameters of the main segmentation network according to the loss. In other words, the trained auxiliary segmentation network generates full labels for the sparse annotations in the second data set, which enlarges the training data available to the main segmentation network and effectively avoids the cost of expensive fully annotated training data.

(2) The present invention fuses the pseudo-labels output by the auxiliary segmentation network module with the face features output by the main segmentation network through the self-refinement module to obtain fused pseudo-labels, computes a cross-entropy loss from the true labels, the fused pseudo-labels, and the labels predicted by the main segmentation network through the computation module, and adjusts the network parameters of the main segmentation network according to the loss, which greatly improves prediction accuracy.
Brief Description of the Drawings

To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings required by the embodiments or the prior art description are briefly introduced below. The drawings described below are merely some embodiments of the present invention; those of ordinary skill in the art can derive other embodiments from them without creative effort.

Fig. 1 is a flowchart of a method for constructing a semi-supervised 3D shape segmentation framework;

Fig. 2 is a flowchart of a method for optimizing a semi-supervised 3D shape segmentation framework;

Fig. 3 is a flowchart of a semi-supervised 3D shape segmentation method;

Fig. 4 shows a 3D shape with sparse scribble annotations;

Fig. 5 shows a fully annotated 3D shape;

Fig. 6 is a structural diagram of the auxiliary segmentation network module;

Fig. 7 is a structural diagram of the main segmentation network module;

Fig. 8 is a diagram of the semi-supervised 3D shape segmentation framework;

Fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description

Embodiments are described in more detail below with reference to the drawings. Although certain embodiments are shown in the drawings, it should be understood that they may be implemented in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding. The drawings and embodiments are for illustration only and are not intended to limit the scope of protection.

To better understand the embodiments of the present application, the technical terms involved are explained as follows:

Semi-supervised learning, a key research topic in pattern recognition and machine learning, is a learning method that combines supervised and unsupervised learning. It performs pattern recognition using a large amount of unlabeled data together with labeled data, so that relatively high accuracy can be achieved with as little manual labeling effort as possible.
To overcome the limitations of current learning-based segmentation methods, reduce the complexity of the training process, and improve segmentation accuracy, the present invention proposes a method for constructing a semi-supervised 3D shape segmentation framework, as shown in Fig. 1. As shown in Fig. 8, the framework comprises the auxiliary segmentation network module shown in Fig. 6, a self-refinement module, and the main segmentation network module shown in Fig. 7, which contains the main segmentation network and the computation module. The auxiliary segmentation network module comprises an auxiliary segmentation network for predicting pseudo-labels, a projection module for projecting 3D shapes, and a back-projection module.

The construction method comprises:

acquiring a first data set and a second data set, where the data in the first data set are fully annotated 3D shapes as shown in Fig. 5, and the data in the second data set are 3D shapes with sparse scribble annotations as shown in Fig. 4.

Note that in this embodiment the first and second data sets are drawn from public benchmark data sets, including the PSB, COSEG, and ShapeNetCore data sets, which provide face-level reference labels for every 3D shape. Note also that the label of each 3D shape in the first data set is a true label.

This embodiment adopts two data set partitioning strategies ("1+2+2" and "2+1+2") for subsequent training. "1+2+2" means that 20% of the benchmark shapes, fully labeled, form the first data set and 40%, with sparse scribble labels, form the second data set, together serving as the training data, while the remaining 40% of the shapes are reserved as the test data set. "2+1+2" means that 40% of the benchmark shapes, fully labeled (first data set), and 20% with sparse scribble labels (second data set) serve as the training data; likewise, the remaining 40% of the shapes are reserved as the test data set.
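The two partitioning strategies can be expressed as fixed fractions of a benchmark category. The helper below is a sketch; the rounding behavior is an assumption:

```python
def split_counts(n_shapes, strategy):
    """Compute data set sizes for the '1+2+2' / '2+1+2' split strategies:
    (fully labeled, scribble-labeled, test) shape counts."""
    fractions = {
        "1+2+2": (0.2, 0.4, 0.4),  # 20% full, 40% scribble, 40% test
        "2+1+2": (0.4, 0.2, 0.4),  # 40% full, 20% scribble, 40% test
    }
    full, scribble, test = fractions[strategy]
    return (round(n_shapes * full),
            round(n_shapes * scribble),
            round(n_shapes * test))

counts = split_counts(20, "1+2+2")   # e.g. a 20-shape benchmark category
```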
对第一数据集中的各三维形状按照预设比例对其参考标签进行采样,以生成对应的带有稀疏Scribble标注的三维形状,得到第三数据集;For each three-dimensional shape in the first data set, the reference label thereof is sampled according to a preset ratio to generate a corresponding three-dimensional shape with sparse Scribble annotations, thereby obtaining a third data set;
将第三数据集输入辅助分割网络模块以训练辅助分割网络,其中,所述投影模块获取第三数据集中各三维形状对应的多视图投影图像得到第四数据集,并建立参考矩阵;所述参考矩阵中的数据包括在获取多视图投影图像的过程中,每个像素对应在原始三维形状上的顶点坐标;所述辅助分割网络基于第一目标函数与第四数据集中的多视图投影图像进行训练,以预测出多视图投影图像中各二维图像对应的预测标签;所述反投影模块利用参考矩阵将预测标签映射回对应的三维形状的面片上,以对第三数据集中三维形状的各面片生成对应的完整标注;The third data set is input into the auxiliary segmentation network module to train the auxiliary segmentation network, wherein the projection module obtains the multi-view projection images corresponding to each three-dimensional shape in the third data set to obtain a fourth data set, and establishes a reference matrix; the data in the reference matrix includes the vertex coordinates of each pixel corresponding to the original three-dimensional shape in the process of obtaining the multi-view projection images; the auxiliary segmentation network is trained based on the first objective function and the multi-view projection images in the fourth data set to predict the predicted labels corresponding to each two-dimensional image in the multi-view projection images; the back-projection module uses the reference matrix to map the predicted labels back to the corresponding three-dimensional shape patches to generate corresponding complete annotations for each patch of the three-dimensional shape in the third data set;
具体地,本实施例对第三数据集中的每个三维形状执行了投影操作:在模型(三维形状)的球形包围盒的预设位置上,放置32个虚拟相机,每个相机以90度的间隔旋转四次,从而生成了总共128组深度图和渲染图,这些图像几乎覆盖了数据集中每个三维形状的所有顶点和面片。Specifically, this embodiment performs a projection operation on each three-dimensional shape in the third data set: 32 virtual cameras are placed at preset positions of the spherical bounding box of the model (three-dimensional shape), and each camera is rotated four times at intervals of 90 degrees, thereby generating a total of 128 sets of depth maps and rendering images, which cover almost all vertices and faces of each three-dimensional shape in the data set.
Specifically, the reference matrix is built as follows: while rendering the three-dimensional shape under a given camera pose, an additional render buffer records, for each pixel, the coordinates of the corresponding vertex on the original shape; this buffer is saved as the reference matrix.
The two-dimensional images of a multi-view projection comprise depth maps and rendered images.
The auxiliary segmentation network comprises:
an Encoder module, which receives the multi-view projection images and extracts the image features of each constituent two-dimensional image; and
a Decoder module, which predicts a label for each two-dimensional image from its image features.
It should be noted that the Encoder and Decoder modules are the main modules of the DeepLabv3+ image semantic segmentation network; this embodiment combines them to produce a semantic label for every pixel of the input image.
The first objective function is expressed as:
where F denotes the first data set, s(f) a face of a three-dimensional shape in F, x(f) the feature vector of face s(f), y(f) the label predicted for x(f) by the auxiliary segmentation network, θ the network parameters of the auxiliary segmentation network, and p_aux the prediction function of the auxiliary segmentation network.
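The formula image itself did not survive extraction. Given the variable definitions above, a plausible reconstruction is the standard negative log-likelihood (cross-entropy) over the labeled faces; this is a hedged inference from the surrounding text, not the patent's exact typesetting:

```latex
\min_{\theta}\; \mathcal{L}_{\mathrm{aux}}(\theta)
  = -\sum_{s^{(f)} \in F} \log p_{\mathrm{aux}}\!\left(y^{(f)} \mid x^{(f)};\, \theta\right)
```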
The back-projection module is specifically configured to look up, for each pixel, the vertex coordinates recorded in the reference matrix, locate the face of the three-dimensional shape at those coordinates, thereby establish the correspondence between the labels predicted on the two-dimensional images and the semantic label of each face of the three-dimensional shape, and map the predicted labels onto the corresponding faces based on that correspondence.
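A minimal sketch of the back-projection just described. For brevity it assumes the reference matrix has already been resolved to one face index per pixel (the patent stores vertex coordinates and then looks up the containing face; collapsing those two steps is an assumption made here). Labels predicted on the 2-D views are accumulated per face and resolved by majority vote:

```python
import numpy as np

def backproject_labels(pred_labels, ref_faces, n_faces, n_classes):
    """pred_labels: (V, H, W) per-pixel class predictions over V views.
    ref_faces:     (V, H, W) per-pixel face index on the 3-D shape; -1 = background.
    Returns one label per face, chosen by majority vote across all views."""
    votes = np.zeros((n_faces, n_classes), dtype=np.int64)
    valid = ref_faces >= 0
    # Unbuffered accumulation: each valid pixel casts one vote for its face.
    np.add.at(votes, (ref_faces[valid], pred_labels[valid]), 1)
    return votes.argmax(axis=1)

# Toy example: 2 views of 2x2 pixels, 3 faces, 2 part classes.
ref = np.array([[[0, 1], [2, -1]], [[0, 0], [1, 2]]])
pred = np.array([[[1, 0], [1, 0]], [[1, 1], [0, 1]]])
face_labels = backproject_labels(pred, ref, n_faces=3, n_classes=2)
# face_labels -> [1, 0, 1]: face 0 gets three class-1 votes, face 1 two
# class-0 votes, face 2 two class-1 votes.
```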
The second data set is fed into the trained auxiliary segmentation network module, which outputs fully annotated three-dimensional shapes, i.e., pseudo-labels, yielding a fifth data set.
The main segmentation network is trained on the first and fifth data sets so that, for each face of a three-dimensional shape, it outputs a face feature and predicts a label.
The main segmentation network comprises:
a main subdivision module, consisting of four fully connected layers, which extracts the face feature of every face of the three-dimensional shape; and
a Softmax module, which predicts a label for each face from its face feature.
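A toy sketch of the main segmentation network's shape: four fully connected layers producing a per-face feature, followed by a softmax over part classes. The layer widths, ReLU nonlinearities, and input-descriptor size are illustrative assumptions; the patent fixes only the four-fully-connected-layer structure and the Softmax module:

```python
import numpy as np

rng = np.random.default_rng(0)

def fc(x, w, b):
    return x @ w + b

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Illustrative widths: 64-D per-face descriptor -> 4 FC layers -> C=4 classes.
dims = [64, 128, 128, 64, 4]
params = [(rng.standard_normal((a, b)) * 0.05, np.zeros(b))
          for a, b in zip(dims[:-1], dims[1:])]

def main_network(face_descriptors):
    h = face_descriptors
    for w, b in params[:-1]:                 # first three FC layers + ReLU
        h = relu(fc(h, w, b))
    face_feat = h                            # per-face feature (also consumed
                                             # by the self-refinement module)
    logits = fc(face_feat, *params[-1])      # fourth FC layer
    return face_feat, softmax(logits)        # Softmax module -> label distribution

feats, probs = main_network(rng.standard_normal((10, 64)))  # 10 faces
```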
The calculation module is specifically configured to:
compute, for the first data set, the cross-entropy loss between the ground-truth labels and the labels predicted by the Softmax module, and, for the fifth data set, the cross-entropy loss between the fused pseudo-labels and the labels predicted by the Softmax module.
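The two cross-entropy terms just described can be sketched as follows. Treating the ground truth as hard class indices and the fused pseudo-labels as soft distributions is an assumption consistent with standard practice, not a detail quoted from the patent:

```python
import numpy as np

def cross_entropy(probs, targets, eps=1e-12):
    """probs: (N, C) Softmax-module outputs.
    targets: (N,) class ids (first data set, ground truth) or
             (N, C) soft distributions (fused pseudo-labels, fifth data set)."""
    probs = np.clip(probs, eps, 1.0)
    if targets.ndim == 1:                    # hard ground-truth labels
        return -np.mean(np.log(probs[np.arange(len(targets)), targets]))
    return -np.mean(np.sum(targets * np.log(probs), axis=1))  # soft labels

probs = np.array([[0.7, 0.3], [0.2, 0.8]])
hard_loss = cross_entropy(probs, np.array([0, 1]))                    # first set
soft_loss = cross_entropy(probs, np.array([[0.9, 0.1], [0.1, 0.9]]))  # fifth set
```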
The self-refinement module fuses the pseudo-labels output by the auxiliary segmentation network module with the face features output by the main segmentation network, yielding fused pseudo-labels.
Because the pseudo-labels for the second data set are produced solely by the trained auxiliary segmentation network module, their accuracy is limited. To improve the accuracy of the generated fully annotated data set, the present invention therefore introduces a CNN-based self-refinement module. This module fuses the predictions of the main and auxiliary segmentation networks to dynamically generate more accurate face-level complete annotations, improving the prediction accuracy of the main segmentation network.
The self-refinement module comprises two convolutional layers which, during prediction, dynamically learn how to weight the predictions of the main and auxiliary segmentation networks on shapes with sparse Scribble annotations. For example, early in training the auxiliary segmentation network's predictions tend to be more accurate, so the module assigns its predicted labels a higher weight; later in training the main segmentation network's predictions tend to be more accurate, so the module shifts the higher weight to them. Through this dynamic adjustment, the self-refinement module produces more accurate label predictions for the second data set.
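A sketch of that fusion step, under stated assumptions: the two "convolutional layers" are modeled as per-face 1x1 convolutions (i.e., linear maps over the stacked channel vector of auxiliary and main predictions), and their output is a per-face gate that mixes the two predictions. The actual layer shapes and the exact fusion rule are not disclosed in this text:

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def self_refine(aux_probs, main_probs, w1, b1, w2, b2):
    """aux_probs, main_probs: (N, C) per-face label distributions.
    Two 1x1-conv (per-face linear) layers produce a scalar gate q in (0, 1)
    per face; the fused pseudo-label is q*aux + (1-q)*main."""
    x = np.concatenate([aux_probs, main_probs], axis=1)  # (N, 2C) channel stack
    h = np.maximum(x @ w1 + b1, 0.0)                     # conv layer 1 + ReLU
    q = sigmoid(h @ w2 + b2)                             # conv layer 2 -> gate
    return q * aux_probs + (1.0 - q) * main_probs, q

C = 4
w1, b1 = rng.standard_normal((2 * C, 8)) * 0.1, np.zeros(8)
w2, b2 = rng.standard_normal((8, 1)) * 0.1, np.zeros(1)
aux = rng.dirichlet(np.ones(C), size=5)    # 5 faces, auxiliary pseudo-labels
main = rng.dirichlet(np.ones(C), size=5)   # main-network predictions
fused, gate = self_refine(aux, main, w1, b1, w2, b2)
```

Because each face's fused label is a convex combination of two distributions, it remains a valid distribution, so it can be consumed directly by the cross-entropy loss above.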
The calculation module computes the cross-entropy loss from the ground-truth labels, the fused pseudo-labels, and the labels predicted by the main segmentation network, and adjusts the network parameters of the main segmentation network according to that loss.
In summary, the present invention samples the reference labels of each three-dimensional shape in the first data set to generate corresponding shapes with sparse Scribble annotations, yielding a third data set; feeds the third data set into the auxiliary segmentation network module to train the auxiliary segmentation network; feeds the second data set into the trained auxiliary segmentation network module, which outputs fully annotated shapes, i.e., pseudo-labels, yielding a fifth data set; trains the main segmentation network on the first and fifth data sets to output a face feature and a predicted label for every face of a three-dimensional shape; fuses, via the self-refinement module, the pseudo-labels output by the auxiliary segmentation network module with the face features output by the main segmentation network, yielding fused pseudo-labels; and, via the calculation module, computes the cross-entropy loss from the ground-truth labels, the fused pseudo-labels, and the predicted labels, adjusting the network parameters of the main segmentation network accordingly. In other words, the trained auxiliary segmentation network turns the sparse annotations of the second data set into complete labels, enlarging the training data available to the main segmentation network and effectively avoiding the cost of expensive fully annotated training data.
As shown in FIG. 2, an embodiment of the present invention further provides a semi-supervised optimization method for the three-dimensional shape segmentation framework, comprising:
obtaining the objective function;
Specifically, the objective function of the three-dimensional shape segmentation framework is constructed from:
the objective function used to train the auxiliary segmentation network, the objective function the main segmentation network needs to optimize when only the first data set is used, and the objective function of the framework when only the auxiliary segmentation network module and the main segmentation network module are used, where:
the objective function for training the auxiliary segmentation network is expressed as:
where F denotes the first data set, s(f) a face of a three-dimensional shape in F, x(f) the feature vector of face s(f), y(f) in the p_aux term the label predicted by the auxiliary segmentation network for x(f), θ the network parameters of the auxiliary segmentation network, and p_aux the prediction function of the auxiliary segmentation network;
the objective function the main segmentation network needs to optimize when only the first data set is used is expressed as:
where the leading symbol denotes the network parameters of the main segmentation network, p_pri the prediction of the main segmentation network, x(f) the feature vector of a face of a three-dimensional shape in the first data set F, and y(f) in the p_pri term the label predicted by the main segmentation network for x(f);
the objective function of the framework when only the auxiliary segmentation network module and the main segmentation network module are used is expressed as:
The objective function of the three-dimensional shape segmentation framework is therefore expressed as:
where the leading symbol and φ both denote network parameters of the main segmentation network, λ denotes the network parameters of the self-refinement module, F the first data set, p_pri the prediction of the main segmentation network, x(f) the feature vector of a face of a three-dimensional shape in F, y(f) the label predicted for x(f) by the main segmentation network, S the second or third data set, q_conv the dynamic weight produced by the self-refinement module, x(s) the feature vector of a face of a three-dimensional shape in S, s(s) such a face, y the prediction of the auxiliary segmentation network for x(s), and s(f) a face of a three-dimensional shape in the first data set.
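The formula images for these objectives are likewise absent from this text. From the variable lists above, plausible reconstructions are the following, writing the main network's parameters as φ (the second parameter symbol is not legible here, so the forms below are hedged inferences rather than the patent's exact equations):

```latex
% Main network trained on the first data set only:
\min_{\phi}\; -\sum_{s^{(f)} \in F}
    \log p_{\mathrm{pri}}\!\left(y^{(f)} \mid x^{(f)};\, \phi\right)

% Full framework: supervised term on F plus a pseudo-label term on S,
% weighted per face by the self-refinement gate q_conv (parameters lambda):
\min_{\phi,\,\lambda}\;
  -\sum_{s^{(f)} \in F} \log p_{\mathrm{pri}}\!\left(y^{(f)} \mid x^{(f)};\, \phi\right)
  \;-\; \sum_{s^{(s)} \in S} q_{\mathrm{conv}}\!\left(x^{(s)};\, \lambda\right)
         \log p_{\mathrm{pri}}\!\left(y \mid x^{(s)};\, \phi\right)
```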
The three-dimensional shape segmentation framework is then optimized with this objective function, the framework itself being obtained by the semi-supervised construction method described above.
As shown in FIG. 3, an embodiment of the present invention further provides a semi-supervised three-dimensional shape segmentation method, comprising:
obtaining an unlabeled three-dimensional shape to be segmented; and
feeding the unlabeled three-dimensional shape into the main segmentation network for prediction, yielding a fully annotated three-dimensional shape, where the main segmentation network is trained by the construction method for the semi-supervised three-dimensional shape segmentation framework described above.
An embodiment of the present invention further provides an electronic device, comprising at least one processor and a memory communicatively connected to the at least one processor. The memory stores a computer program executable by the at least one processor; when executed by the at least one processor, the program causes the electronic device to perform the method of the embodiments of the present invention.
An embodiment of the present invention further provides a non-transitory machine-readable medium storing a computer program which, when executed by a processor of a computer, causes the computer to perform the method of the embodiments of the present invention.
An embodiment of the present invention further provides a computer program product comprising a computer program which, when executed by a processor of a computer, causes the computer to perform the method of the embodiments of the present invention.
With reference to FIG. 4, a structural block diagram of an electronic device that can serve as the server or client of an embodiment of the present invention will now be described; it is an example of a hardware device to which aspects of the present invention may be applied. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other suitable computers, as well as various forms of mobile devices, such as personal digital assistants, cellular phones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are merely examples and are not intended to limit the implementations of the invention described and/or claimed herein.
As shown in FIG. 4, the electronic device includes a computing unit 401, which can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 402 or loaded from a storage unit 408 into a random-access memory (RAM) 403. The RAM 403 can also store the programs and data required for the operation of the electronic device. The computing unit 401, ROM 402, and RAM 403 are connected to one another via a bus 404; an input/output (I/O) interface 405 is also connected to the bus 404.
Several components of the electronic device are connected to the I/O interface 405: an input unit 406, an output unit 407, the storage unit 408, and a communication unit 409. The input unit 406 may be any device capable of entering information into the electronic device; it can receive numeric or character input and generate key-signal input related to user settings and/or function control. The output unit 407 may be any device capable of presenting information, including but not limited to a display, a speaker, a video/audio output terminal, a vibrator, and/or a printer. The storage unit 408 may include, but is not limited to, magnetic and optical disks. The communication unit 409 allows the electronic device to exchange information/data with other devices over computer networks such as the Internet and/or various telecommunication networks, and may include, but is not limited to, a modem, a network card, an infrared communication device, a wireless transceiver and/or chipset, such as a Bluetooth, Wi-Fi, WiMax, or cellular communication device, or the like.
The computing unit 401 may be any of various general-purpose and/or special-purpose processing components with processing and computing capability. Examples include, but are not limited to, a CPU, a graphics processing unit (GPU), dedicated artificial-intelligence (AI) computing chips, computing units running machine-learning model algorithms, digital signal processors (DSPs), and any appropriate processor, controller, or microcontroller. The computing unit 401 performs the methods and processes described above. For example, in some embodiments the method embodiments of the present invention may be implemented as a computer program tangibly embodied in a machine-readable medium such as the storage unit 408. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device via the ROM 402 and/or the communication unit 409. In some embodiments, the computing unit 401 may be configured to perform the above methods in any other suitable way (for example, by means of firmware).
Computer programs for implementing the methods of the embodiments of the present invention may be written in any combination of one or more programming languages. They may be supplied to a processor or controller of a general-purpose computer, a special-purpose computer, or another programmable data-processing apparatus, such that, when executed by the processor or controller, they implement the functions/operations specified in the flowcharts and/or block diagrams. A computer program may run entirely on a machine, partly on a machine, partly on a machine and partly on a remote machine as a standalone software package, or entirely on a remote machine or server.
In the context of the embodiments of the present invention, a machine-readable medium may be a tangible medium that contains or stores a program for use by, or in connection with, an instruction-execution system, apparatus, or device. It may be a machine-readable signal medium or a machine-readable storage medium, and may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination thereof. More specific examples of a machine-readable storage medium include an electrical connection with one or more wires, a portable computer diskette, a hard disk, random-access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact-disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
It should be noted that the term "comprising" and its variants as used in the embodiments of the present invention are open-ended, i.e., "including but not limited to". The term "based on" means "based at least in part on". "One embodiment" means "at least one embodiment"; "another embodiment" means "at least one further embodiment"; "some embodiments" means "at least some embodiments". References to "a/an" and "a plurality of" are illustrative rather than restrictive and, as those skilled in the art will understand, should be read as "one or more" unless the context clearly indicates otherwise.
The user information (including but not limited to user-device information and personal information) and data (including but not limited to data used for analysis, stored data, and displayed data) involved in the embodiments of the present invention are information and data authorized by the user or fully authorized by all parties; the collection, use, and processing of such data must comply with the applicable laws, regulations, and standards of the relevant countries and regions, and a corresponding mechanism is provided for users to grant or refuse authorization.
The steps described in the method embodiments of the present invention may be performed in a different order and/or in parallel. Method embodiments may also include additional steps and/or omit steps shown. The scope of protection of the present invention is not limited in this respect.
The word "embodiment" in this specification means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present invention. The appearance of this phrase in various places in the specification does not necessarily refer to the same embodiment, nor does it imply an embodiment that is independent of, or an alternative to, other embodiments. The embodiments in this specification are described in a related manner, and identical or similar parts of the embodiments may be understood by cross-reference; in particular, the apparatus, device, and system embodiments, being essentially similar to the method embodiments, are described more briefly, and the relevant details can be found in the description of the method embodiments.
The above embodiments express only several implementations of the present invention; although their description is relatively specific and detailed, it should not be construed as limiting the scope of patent protection. It should be noted that a person of ordinary skill in the art may make several variations and improvements without departing from the concept of the present invention, and these all fall within the scope of protection of the present invention. The scope of protection of the present invention is therefore defined by the appended claims.
Claims (10)

Priority application: CN202311667459.2A, filed 2023-12-06 (priority date 2023-12-06) — Construction and optimization method of three-dimensional shape segmentation framework based on semi-supervision and electronic equipment

Publication: CN117934488A, published 2024-04-26; legal status: pending
Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination