CN110457515B - 3D Model Retrieval Method Based on a Multi-View Neural Network with Global Feature Capture and Aggregation - Google Patents

3D Model Retrieval Method Based on a Multi-View Neural Network with Global Feature Capture and Aggregation

Info

Publication number
CN110457515B
CN110457515B (application CN201910653415.1A)
Authority
CN
China
Prior art keywords
model
dimensional
view
neural network
dimensional model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910653415.1A
Other languages
Chinese (zh)
Other versions
CN110457515A (en)
Inventor
高赞
李荫民
张桦
徐光平
薛彦兵
王志岗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University of Technology
Original Assignee
Tianjin University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University of Technology filed Critical Tianjin University of Technology
Priority to CN201910653415.1A
Publication of CN110457515A
Application granted
Publication of CN110457515B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/5854 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using shape and object relationship
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24147 Distances to closest patterns, e.g. nearest neighbour classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Library & Information Science (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)

Abstract

A 3D model retrieval method based on a multi-view neural network with global feature capture and aggregation. The method mines the intrinsic relations among multi-view images and effectively aggregates them, yielding a compact, highly discriminative shape feature descriptor for the 3D model. It comprises the following steps: (1) multi-view representation of the model; (2) designing the network model; (3) generating the hardest sample pairs; (4) training the network model; (5) extracting deep features; (6) performing model retrieval. The invention mines the connections among the multi-view branches using the non-local idea, and fuses the multi-view features through a weighted local aggregation layer, thereby obtaining a highly discriminative and compact 3D model descriptor.

Description

Three-dimensional model retrieval method based on a multi-view neural network with global feature capture and aggregation
Technical Field
The invention belongs to the fields of computer vision and deep learning, and discloses a three-dimensional model retrieval method based on a multi-view neural network with global feature capture and aggregation. The method mines the internal relations among the multi-view images representing a three-dimensional model, thereby improving the performance of three-dimensional model retrieval.
Background
With the great improvement of computer performance, large numbers of three-dimensional models are being generated in fields such as three-dimensional medicine, virtual reality, and 3D games, and how to identify and retrieve them has become an active research direction in computer vision. Model representation methods in three-dimensional model retrieval fall into two types: 1) model-based representation methods, such as discrete mesh- or voxel-based representations, as well as point-cloud-based representations; features designed on these representations mostly derive from the shape of the model itself and its geometric properties, such as hand-designed three-dimensional histograms and feature bags constructed from surface curvature and normals; 2) multi-view representation methods, in which a three-dimensional model is represented by two-dimensional images captured from different views; many hand-crafted features also exist for this representation, such as histograms of oriented gradients, Zernike moments, and SIFT features.
However, conventional hand-designed features perform poorly in retrieval: because each design algorithm emphasizes different aspects, such features cannot represent a model comprehensively. Deep learning techniques, such as the classical AlexNet and GoogLeNet deep convolutional neural networks, are now widely applied in computer vision. A neural network automatically learns and fits image features from data, and compared with hand-designed features it learns more comprehensive ones, greatly improving image recognition. In multi-view three-dimensional model retrieval, each three-dimensional model is represented by several view images, but existing deep neural networks are mainly designed to recognize a single image, so their recognition is limited by incomplete information. How to aggregate multi-view image information and how to capture the spatial characteristics of the model is therefore the key to improving three-dimensional model retrieval performance.
To aggregate multi-view information and capture the spatial features of the model, the multi-view image features cannot simply be concatenated directly. Concatenation has several drawbacks: the feature dimension grows multiplicatively, which increases retrieval time, and simple concatenation cannot effectively capture spatial features, so retrieval performance does not improve noticeably.
Disclosure of Invention
The invention aims to solve the problems that multi-view features cannot be effectively aggregated and model space information is lost in the conventional method, and provides a three-dimensional model retrieval method of a multi-view neural network based on global feature capture aggregation.
The method is used for mining the internal relation among the multi-view images of the three-dimensional model, capturing the spatial information of the three-dimensional model and simultaneously improving the retrieval speed by fusing the multi-view features. The invention is specifically verified in three-dimensional model retrieval.
The invention provides a multi-view neural network three-dimensional model retrieval method based on global feature capture aggregation, which comprises the following steps:
1. Multi-view representation of three-dimensional models
The invention retrieves three-dimensional models based on their multi-view representation: after the three-dimensional model data are obtained, viewing angles are set in processing software and view images of the model are captured from the corresponding angles.
2. Designing the network model
A dedicated double-chain deep neural network model is designed according to the characteristics of three-dimensional model retrieval and used to train and learn a feature representation suitable for three-dimensional models. The double-chain deep neural network model comprises five parts: a low-dimensional convolution module, a non-local module, a high-dimensional convolution module, a weighted local aggregation layer, and a classification layer. Meanwhile, a fusion loss function based on the center loss and the pairwise boundary loss is designed to increase the discriminability between three-dimensional models of different classes.
3. Generating the hardest sample pairs
The double-chain deep neural network model takes sample pairs as input, and pairing all samples would produce an extremely large number of pairs. The hardest sample pairs are therefore generated on the principle of selecting the farthest samples within the same class and the nearest samples across different classes.
4. Training the network model
The double-chain deep neural network is trained on the three-dimensional model training set and, driven by the objective function, learns network parameters that comprehensively represent the training data.
5. Extracting deep features
In the retrieval process, every three-dimensional model uses a feature representation; the invention extracts features using the network parameters trained in step 4. The network input is the set of view images representing a single three-dimensional model; through the feature extraction and aggregation of the double-chain deep neural network, the features of the multiple view images are aggregated into a highly discriminative three-dimensional model feature descriptor.
6. Performing three-dimensional model retrieval
Given a three-dimensional model, we want to find the three-dimensional models of the same class, i.e. the related models, in the target dataset. Feature description and distance measurement are crucial in three-dimensional model retrieval: the feature description uses the deep features extracted in step 5, and the distance measure uses the Euclidean distance formula, computed as follows.
d(x, y) = sqrt( Σ_i (x_i - y_i)^2 )   (1)

where x and y denote two three-dimensional models, d(x, y) is the distance between them, and x_i, y_i denote the i-th dimensional feature of x and of y, respectively.
The advantages and beneficial effects of the invention:
1) For multi-view images, a non-local module is used to mine the latent relevance between the individual view images.
2) A weighted local aggregation layer aggregates the captured high-dimensional features of each view into a highly discriminative three-dimensional model feature descriptor.
3) Through these two improvements, the invention achieves state-of-the-art performance in three-dimensional model retrieval; the results are shown in Fig. 5.
Drawings
FIG. 1 is a double-chain deep neural network structure designed by the present invention.
FIG. 2 is a retrieval flow diagram of the present invention.
FIG. 3 is an example of a three-dimensional model data set.
FIG. 4 is a multi-perspective image of a three-dimensional model.
FIG. 5 is a comparison of the retrieval performance of the present method with current state-of-the-art methods on the ModelNet40 dataset. The references compared in FIG. 5 are as follows.
[1] You H, Feng Y, Ji R, et al. PVNet: A Joint Convolutional Network of Point Cloud and Multi-View for 3D Shape Recognition. ACM Multimedia, 2018: 1310-1318.
[2] He X, Zhou Y, Zhou Z, et al. Triplet-Center Loss for Multi-view 3D Object Retrieval. Computer Vision and Pattern Recognition, 2018: 1945-1954.
[3] Yavartanoo M, Kim E Y, Lee K M, et al. SPNet: Deep 3D Object Classification and Retrieval using Stereographic Projection. arXiv: Computer Vision and Pattern Recognition, 2018.
[4] Feng Y, Zhang Z, Zhao X, et al. GVCNN: Group-View Convolutional Neural Networks for 3D Shape Recognition. Computer Vision and Pattern Recognition, 2018: 264-272.
[5] Su H, Maji S, Kalogerakis E, et al. Multi-view Convolutional Neural Networks for 3D Shape Recognition. International Conference on Computer Vision, 2015: 945-953.
[6] Bai S, Bai X, Zhou Z, et al. GIFT: A Real-Time and Scalable 3D Shape Search Engine. Computer Vision and Pattern Recognition, 2016: 5023-5032.
[7] Shi B, Bai S, Zhou Z, et al. DeepPano: Deep Panoramic Representation for 3-D Shape Recognition. IEEE Signal Processing Letters, 2015, 22(12): 2339-2343.
[8] Sinha A, Bai J, Ramani K, et al. Deep Learning 3D Shape Surfaces Using Geometry Images. European Conference on Computer Vision, 2016: 223-240.
[9] Wu Z, Song S, Khosla A, et al. 3D ShapeNets: A Deep Representation for Volumetric Shapes. Computer Vision and Pattern Recognition, 2015: 1912-1920.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Example 1:
Fig. 2 is a flowchart illustrating the steps of the three-dimensional model retrieval method based on a multi-view neural network with global feature capture and aggregation according to the present invention; the specific operation steps are as follows.
Step 1: Multi-view representation of the three-dimensional model
Three-dimensional models in fields such as medicine, games, and industrial design are usually stored as polygonal meshes: collections of points connected by edges that form surfaces. In the method proposed by the invention, a three-dimensional model therefore needs to be represented by multi-view images. To create this multi-view shape representation, we set up viewpoints (virtual cameras) to render each mesh: 12 rendered view images are created by placing 12 virtual cameras around the mesh, one every 30 degrees, as shown in Fig. 4.
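For concreteness, the camera layout can be enumerated as in the following Python sketch. The 30-degree elevation, the camera distance, and the render_view call are illustrative assumptions; the patent fixes only the 12 azimuth steps of 30 degrees.

    NUM_VIEWS = 12        # one virtual camera every 30 degrees around the mesh
    ELEVATION_DEG = 30.0  # assumed elevation; the patent does not specify it

    def camera_poses(radius=2.0):
        """Yield (azimuth, elevation, radius) for the 12 virtual cameras."""
        for k in range(NUM_VIEWS):
            yield k * (360.0 / NUM_VIEWS), ELEVATION_DEG, radius  # azimuths 0, 30, ..., 330

    # Hypothetical usage with a rendering backend of choice:
    # for az, el, r in camera_poses():
    #     image = render_view(mesh, azimuth=az, elevation=el, distance=r)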
Step 2: Designing the network model
According to the characteristics of three-dimensional model retrieval, a dedicated double-chain deep neural network model is designed for training and learning a feature representation suitable for three-dimensional models. The double-chain deep neural network model comprises five parts: a low-dimensional convolution module, a non-local module, a high-dimensional convolution module, a weighted local aggregation layer, and a classification layer. Meanwhile, a fusion loss function based on the center loss and the pairwise boundary loss is designed to increase the discriminability between three-dimensional models of different classes.
The low-dimensional convolution module contains a convolutional layer with a 7x7 kernel and stride 2, followed by a max-pooling layer with a 3x3 kernel and stride 2; this module captures the low-dimensional features of the extracted views. The non-local module then mines the relations between the views: a graph structure connecting the different view images is constructed via the non-local idea. The formula of the non-local module is as follows.
y_a = (1 / U(V)) Σ_b g(v_a, v_b) h(m_b)   (2)

Given a view set V, v_a is the a-th view in V and v_b is the b-th view in V; m_b is the input feature of view v_b, and y_a is the output corresponding to v_a. The pairwise function g computes the correlation between two views, the unary function h scales the input, and the function U is used for normalization.
For convenience of the convolution operation, the pairwise function g takes the following form:

g(v_a, v_b) = α(v_a)^T β(v_b)   (3)

where α(v_a) = W_α v_a and β(v_b) = W_β v_b, with W_α and W_β learnable weight matrices. The normalization factor is U(V) = N, where N is the number of views contained in the view set V.
The unary function h is a linear function:

h(m_b) = W_h m_b   (4)

where W_h is a convolutional-layer parameter, implemented as a 1x1 convolution. The output of the non-local module is then:
z_a = W_z y_a + m_a   (5)

where W_z y_a + m_a denotes a residual connection, y_a is computed by formula (2), m_a is the original input, and z_a is the final output of the non-local module. Thanks to this residual form, the module can be conveniently inserted into an existing network model without adjusting the original model to accommodate it.
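The following is a minimal PyTorch sketch of this non-local module, assuming the N per-view features enter as a (batch, channels, N) tensor; the class name and the halved embedding width are illustrative choices, while the computation follows equations (2) through (5) with U(V) = N.

    import torch
    import torch.nn as nn

    class NonLocalViewBlock(nn.Module):
        """Non-local module over a set of view features, following eqs. (2)-(5)."""
        def __init__(self, channels, embed_channels=None):
            super().__init__()
            embed_channels = embed_channels or channels // 2
            self.alpha = nn.Conv1d(channels, embed_channels, 1)  # W_alpha in eq. (3)
            self.beta = nn.Conv1d(channels, embed_channels, 1)   # W_beta in eq. (3)
            self.h = nn.Conv1d(channels, embed_channels, 1)      # W_h in eq. (4), a 1x1 convolution
            self.w_z = nn.Conv1d(embed_channels, channels, 1)    # W_z in eq. (5)

        def forward(self, m):
            # m: (B, C, N) -- N views with C channels each
            N = m.shape[2]
            a = self.alpha(m)                                # (B, E, N)
            b = self.beta(m)                                 # (B, E, N)
            g = torch.bmm(a.transpose(1, 2), b)              # g(v_a, v_b) for all pairs: (B, N, N)
            y = torch.bmm(g, self.h(m).transpose(1, 2)) / N  # eq. (2) with U(V) = N
            return self.w_z(y.transpose(1, 2)) + m           # residual connection, eq. (5)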
A high-dimensional convolution module (four residual convolution blocks) is added after the non-local module to capture the high-level abstract features of the views. The first block comprises 6 convolutional layers with 3x3 kernels, the second 8, the third 12, and the fourth 6; a residual connection spans every two convolutional layers, which effectively prevents gradient explosion. After the high-level abstract features of the multiple views are extracted, a weighted local aggregation layer self-learns a discriminative weight for each view. A virtual-class-center idea is used to assign the multi-view images: the virtual classes participate in the assignment but are discarded directly in the output, which reduces the contribution of views with low distinctiveness. The input of this layer is the feature vector of each view, and its output is the class-center residual vector features with the virtual classes removed. Finally, a Max-pooling operation aggregates these into a compact model descriptor with high discrimination. The classification layer uses Softmax classification.
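The weighted local aggregation layer is described in the style of a NetVLAD-like layer with virtual ("ghost") centers. The following is a hedged sketch of that idea, not the patent's exact layer: the numbers of real and virtual centers are assumptions, and the virtual centers take part in the soft assignment but their residuals are dropped before Max-pooling.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class WeightedLocalAggregation(nn.Module):
        """Aggregation with virtual centers: they join the soft assignment,
        but their residuals are dropped, down-weighting uninformative views."""
        def __init__(self, dim, num_centers=8, num_virtual=1):
            super().__init__()
            total = num_centers + num_virtual
            self.assign = nn.Linear(dim, total)              # self-learned assignment weights
            self.centers = nn.Parameter(torch.randn(total, dim))
            self.num_centers = num_centers                   # real (kept) centers

        def forward(self, x):                                # x: (B, N_views, D)
            w = F.softmax(self.assign(x), dim=-1)            # (B, N, K+V), virtual included
            resid = x.unsqueeze(2) - self.centers            # residuals to all centers: (B, N, K+V, D)
            resid = (w.unsqueeze(-1) * resid)[:, :, :self.num_centers]  # weight, drop virtual
            desc = resid.max(dim=1).values                   # Max-pooling across views: (B, K, D)
            return F.normalize(desc.flatten(1), dim=-1)      # compact descriptor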
A double-chain network is built on the basis of the single chain, and multiple loss functions, namely a pairwise boundary loss function and a center loss function, are fused to learn more discriminative feature descriptors.
The pairwise boundary loss function is shown below:
L_b(x_i, x_j) = (1 - y_ij)(α - d_ij) + y_ij [d_ij - (α - m)]   (6)

where (x_i, x_j) is a sample pair, d_ij is the Euclidean distance between them, and y_ij is the pair weight; α is the boundary, m is the distance between the farthest positive sample and the nearest negative sample, and L_b(x_i, x_j) is the resulting loss value.
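A sketch of equation (6) in PyTorch, assuming y_ij = 1 marks a same-class pair and y_ij = 0 a different-class pair; the clamp at zero is an added assumption, since the patent writes only the raw terms (α = 1.2 and m = 0.4 are the training settings reported in step 4).

    import torch

    def pairwise_boundary_loss(d, y, alpha=1.2, m=0.4):
        # d: Euclidean distances of the sample pairs, shape (P,)
        # y: 1 for same-class pairs, 0 for different-class pairs, shape (P,)
        loss = (1 - y) * (alpha - d) + y * (d - (alpha - m))  # eq. (6)
        return torch.clamp(loss, min=0).mean()                # hinge at zero (assumed)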
The center loss function is as follows:
L_c = (1/2) Σ_i ||x_i - c_i||^2   (7)

where x_i is a sample feature, c_i is the center of the class that x_i belongs to, and L_c is the corresponding center-loss value. The distance between each sample feature and its class center is computed; the smaller the distance, the smaller the loss, which shrinks the intra-class distances and yields more discriminative feature descriptors.
By fusing these loss functions, the intra-class distance is reduced while the inter-class distance is increased, so that more discriminative feature descriptors are learned automatically.
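A sketch of the center loss of equation (7) and the fused objective follows; the 1/2 factor matches the standard center-loss formulation (an assumption, since the patent shows the formula only as a figure), and the fusion weight lam is an assumed hyperparameter not specified in the text.

    import torch

    def center_loss(features, labels, centers):
        # features: (B, D) sample features; labels: (B,) class ids
        # centers:  (num_classes, D) learnable class centers
        diff = features - centers[labels]           # x_i minus its class center, eq. (7)
        return 0.5 * (diff ** 2).sum(dim=1).mean()

    # Fused objective: the boundary loss enlarges inter-class distances while the
    # center loss shrinks intra-class distances; lam is an assumed fusion weight.
    # total = pairwise_boundary_loss(d, y) + lam * center_loss(f, labels, centers)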
Step 3: Hardest sample pair generation
During the generation of sample pairs, suppose there are c classes with p objects in each class. To generate the hardest positive sample pairs, the k farthest same-class objects are selected for each object, so the positive pairs total c·k·p sample pairs. When generating the negative pairs, each object is matched with the nearest object of the other classes as its nearest-neighbor different-class sample, giving a total of c·p² sample pairs.
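The mining rule can be sketched as follows; the feature shapes, helper name, and default k are illustrative, while the rule implemented is the stated principle: k farthest same-class objects as positives and the nearest different-class object as a negative for each object.

    import torch

    def hardest_pairs(features, labels, k=3):
        # features: (M, D) descriptors; labels: (M,) class ids; returns (i, j, y) triples
        dist = torch.cdist(features, features)             # (M, M) Euclidean distances
        same = labels.unsqueeze(0) == labels.unsqueeze(1)  # same-class mask
        pairs = []
        for i in range(len(features)):
            pos = torch.where(same[i])[0]
            pos = pos[pos != i]
            if len(pos) > 0:                               # k farthest same-class objects
                far = pos[dist[i, pos].topk(min(k, len(pos))).indices]
                pairs += [(i, int(j), 1) for j in far]
            neg = torch.where(~same[i])[0]
            if len(neg) > 0:                               # single nearest different-class object
                pairs.append((i, int(neg[dist[i, neg].argmin()]), 0))
        return pairs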
Step 4: Training the network model
The double-chain deep neural network is trained on the three-dimensional model training set and, driven by the objective function, learns network parameters that comprehensively represent the training data.
The invention can train the network model with the PyTorch deep learning framework. First the input data undergo preprocessing, including data normalization, unifying the original image size, random cropping, random horizontal flipping, and vertical flipping. Data normalization maps the raw data to a statistical distribution on a fixed interval, which accelerates convergence. The original image size is unified because the network architecture is fixed once designed, so the input image size must match the size the network expects. Random cropping and horizontal and vertical flipping increase the amount of data to prevent the network from overfitting. For the initial parameter settings, we set the number of iteration rounds to 8 with 20 sample pairs per iteration; the initial learning rate is set to 0.0001, and the network is initialized with parameters pre-trained on the large ImageNet dataset. The parameter α is set to 1.2 and m to 0.4. The invention uses an adaptive gradient optimizer that adjusts the learning rate for each parameter individually; the step update is computed by jointly considering the first-moment and second-moment estimates of the gradient.
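A sketch of the reported preprocessing and optimizer settings in PyTorch; the 224x224 input size, crop padding, and ImageNet normalization statistics are assumptions (chosen because the backbone is pre-trained on ImageNet), and the model line is a stand-in for the double-chain network.

    import torch
    import torch.nn as nn
    from torchvision import transforms

    preprocess = transforms.Compose([
        transforms.Resize((224, 224)),           # unify the original image size
        transforms.RandomCrop(224, padding=8),   # random cropping
        transforms.RandomHorizontalFlip(),       # horizontal random flip
        transforms.RandomVerticalFlip(),         # vertical flip
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],  # data normalization
                             std=[0.229, 0.224, 0.225]),
    ])

    model = nn.Linear(512, 40)  # stand-in for the ImageNet-pretrained double-chain network
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # first/second-moment adaptive optimizer
    EPOCHS, PAIRS_PER_ITER = 8, 20
    ALPHA, M = 1.2, 0.4          # boundary-loss parameters reported in the text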
Step 5: Extracting deep features
In the retrieval process, each three-dimensional model uses a feature representation. The invention trains the network with the double-chain structure to obtain the trained neural network model, and then uses the network parameters trained in step 4 as the parameters for feature extraction; for extracting features, a single chain of the trained network is used. The network input is the multiple view images representing a single three-dimensional model; through the feature extraction and aggregation of the proposed network structure, the features of the multiple view images are aggregated into the feature of a single three-dimensional model. The dimension of the extracted features is 512.
Step 6: Three-dimensional model retrieval
Given a three-dimensional model, the three-dimensional models of the same class, i.e. the related models, are to be found in the target dataset. Let the retrieval (query) set of three-dimensional models be Q and the dataset to be queried be G; the goal is to find, in G, the three-dimensional models related to each model in Q. This is realized by computing the relevance of model Q_i to each three-dimensional model in the dataset G and sorting by relevance to obtain the models related to Q_i. The concrete realization is as follows.
Both the retrieval set and the dataset to be queried are represented by feature vectors, extracted as in step 5. After the feature representation of every three-dimensional model in both sets is obtained, the distance between model Q_i and each three-dimensional model in the dataset G to be queried is computed in the following form.
L_ij = f(Q_i, G_j)   (8)

where L_ij is the distance between the three-dimensional models Q_i and G_j, and f(Q_i, G_j) is the distance measure between two three-dimensional models; the invention uses the Euclidean distance, computed as:

d(x, y) = sqrt( Σ_i (x_i - y_i)^2 )   (9)

where x and y denote different three-dimensional models, d(x, y) is the distance between the two models, and x_i, y_i denote the i-th dimensional feature of x and of y, respectively. After the distances between Q_i and every three-dimensional model in G are computed, they are sorted, and the first k results are taken as the models related to Q_i. The results of sequentially computing the three-dimensional models in G related to those in Q on the ModelNet40 dataset are shown in Fig. 5.
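The ranking step can be sketched as follows, assuming the 512-dimensional descriptors of step 5 are stacked into tensors; the function and variable names are illustrative.

    import torch

    def retrieve(query_feats, gallery_feats, topk=5):
        # query_feats: (Nq, 512) descriptors of Q; gallery_feats: (Ng, 512) descriptors of G
        dists = torch.cdist(query_feats, gallery_feats)  # L_ij = d(Q_i, G_j), eqs. (8)-(9)
        ranked = dists.argsort(dim=1)                    # ascending distance: most related first
        return ranked[:, :topk]                          # top-k related models per query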
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention and are not limiting. Although the present invention has been described in detail with reference to the embodiments, those skilled in the art will understand that various changes may be made and equivalents substituted without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (1)

1. A three-dimensional model retrieval method based on a multi-view neural network with global feature capture and aggregation, used to mine the latent internal relations among the multiple views of a three-dimensional model so as to improve the performance of three-dimensional model retrieval, characterized in that the method comprises the following steps:

Step 1, multi-view representation of the three-dimensional model: the method retrieves three-dimensional models based on their multi-view representation; after the three-dimensional model data are obtained, viewing angles are set in processing software and the view images of the model at the corresponding angles are captured;

Step 2, designing the network model: a dedicated double-chain deep neural network model is designed according to the characteristics of three-dimensional model retrieval and used to train and learn a feature representation suitable for three-dimensional models; the double-chain deep neural network model comprises five parts: a low-dimensional convolution layer, a non-local module, a high-dimensional convolution layer, a weighted local aggregation layer, and a classification layer; meanwhile, a fusion loss function based on the center loss and the pairwise boundary loss is designed to increase the discriminability between three-dimensional models of different classes;

Step 3, generating the hardest sample pairs: the double-chain deep neural network model takes sample pairs as input, and pairing all samples would produce an extremely large number of pairs; the hardest sample pairs are therefore generated on the principle of the farthest samples within the same class and the nearest samples across different classes;

Step 4, training the network model: the double-chain deep neural network is trained on the three-dimensional model training set and, through the objective function, learns network parameters that comprehensively represent the training data;

Step 5, extracting deep features: in the retrieval process every three-dimensional model uses a feature representation; this method extracts features with the network parameters trained in Step 4; the network input is the multiple view images representing a single three-dimensional model, and through the feature extraction and aggregation of the double-chain deep neural network the multi-view image features are aggregated into a single three-dimensional model feature;

Step 6, performing three-dimensional model retrieval: given a three-dimensional model, the three-dimensional models of the same class, i.e. the related models, are to be found in the target dataset; feature description and distance measurement are crucial in three-dimensional model retrieval: the feature description uses the deep features extracted in Step 5, and the distance measure uses the Euclidean distance formula, computed as

d(x, y) = sqrt( Σ_i (x_i - y_i)^2 )

where x and y denote different three-dimensional models, d(x, y) is the distance between the two three-dimensional models, and x_i, y_i denote the i-th dimensional feature of x and of y, respectively.
CN201910653415.1A 2019-07-19 2019-07-19 3D model retrieval method based on a multi-view neural network with global feature capture and aggregation Active CN110457515B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910653415.1A CN110457515B (en) 2019-07-19 2019-07-19 3D model retrieval method based on a multi-view neural network with global feature capture and aggregation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910653415.1A CN110457515B (en) 2019-07-19 2019-07-19 3D model retrieval method based on a multi-view neural network with global feature capture and aggregation

Publications (2)

Publication Number Publication Date
CN110457515A CN110457515A (en) 2019-11-15
CN110457515B true CN110457515B (en) 2021-08-24

Family

ID=68481530

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910653415.1A Active CN110457515B (en) 2019-07-19 2019-07-19 3D model retrieval method based on a multi-view neural network with global feature capture and aggregation

Country Status (1)

Country Link
CN (1) CN110457515B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111078913A (en) * 2019-12-16 2020-04-28 天津运泰科技有限公司 Three-dimensional model retrieval method based on multi-view convolution neural network
CN111382300B (en) * 2020-02-11 2023-06-06 山东师范大学 Multi-view three-dimensional model retrieval method and system based on pairing depth feature learning
CN111340866B (en) 2020-02-26 2024-03-01 腾讯科技(深圳)有限公司 Depth image generation method, device and storage medium
CN111598130A (en) * 2020-04-08 2020-08-28 天津大学 Identification method of traditional Chinese medicine based on multi-view convolutional neural network
CN111513709B (en) * 2020-05-20 2021-08-24 浙江大学 Nonlocal neural network myocardial transmembrane potential reconstruction method based on iterative contraction threshold algorithm
CN111914697A (en) * 2020-07-16 2020-11-10 天津大学 Multi-view target identification method based on view semantic information and sequence context information
CN113869120B (en) * 2021-08-26 2022-08-05 西北大学 A View Filtering-Based Classification Method for Aggregated Convolutional 3D Models
CN114238676A (en) * 2021-12-22 2022-03-25 芯勍(上海)智能化科技股份有限公司 MBD model retrieval method and device based on graph neural network
CN116310425B (en) * 2023-05-24 2023-09-26 山东大学 Fine-grained image retrieval method, system, equipment and storage medium

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104268592A (en) * 2014-09-22 2015-01-07 天津理工大学 Multi-view combined movement dictionary learning method based on collaboration expression and judgment criterion
WO2017156243A1 (en) * 2016-03-11 2017-09-14 Siemens Aktiengesellschaft Deep-learning based feature mining for 2.5d sensing image search
CN106204467A (en) * 2016-06-27 2016-12-07 深圳市未来媒体技术研究院 A kind of image de-noising method based on cascade residual error neural net
KR101854048B1 (en) * 2016-11-25 2018-05-02 연세대학교 산학협력단 Method and device for measuring confidence of depth map by stereo matching
CN106682233A (en) * 2017-01-16 2017-05-17 华侨大学 Method for Hash image retrieval based on deep learning and local feature fusion
CN106844620A (en) * 2017-01-19 2017-06-13 天津大学 A kind of characteristic matching method for searching three-dimension model based on view
CN108492364A (en) * 2018-03-27 2018-09-04 百度在线网络技术(北京)有限公司 The method and apparatus for generating model for generating image
CN109213884A (en) * 2018-11-26 2019-01-15 北方民族大学 A kind of cross-modal search method based on sketch retrieval of three-dimensional models

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
3D Model Retrieval Algorithm Based on Residual Networks; 李萌民; Computer Science (《计算机科学》); 2019-03-31; Vol. 46, No. 3; full text *

Also Published As

Publication number Publication date
CN110457515A (en) 2019-11-15

Similar Documents

Publication Publication Date Title
CN110457515B (en) 3D model retrieval method based on a multi-view neural network with global feature capture and aggregation
Liu et al. Deep fitting degree scoring network for monocular 3d object detection
Qi et al. Volumetric and multi-view cnns for object classification on 3d data
CN108491880B (en) Object classification and pose estimation method based on neural network
Li et al. 3D IoU-Net: IoU guided 3D object detector for point clouds
CN104599275B (en) The RGB-D scene understanding methods of imparametrization based on probability graph model
CN112686331B (en) Forged image recognition model training method and forged image recognition method
CN111028327B (en) A processing method, device and equipment for a three-dimensional point cloud
AU2020104423A4 (en) Multi-View Three-Dimensional Model Retrieval Method Based on Non-Local Graph Convolutional Network
CN108648161A (en) The binocular vision obstacle detection system and method for asymmetric nuclear convolutional neural networks
CN113408584B (en) RGB-D multi-modal feature fusion 3D target detection method
CN104850850A (en) Binocular stereoscopic vision image feature extraction method combining shape and color
CN104091169A (en) Behavior identification method based on multi feature fusion
CN105205135B (en) A kind of 3D model retrieval methods and its retrieval device based on topic model
CN101477529A (en) Three-dimensional object retrieval method and apparatus
CN108564111A (en) A kind of image classification method based on neighborhood rough set feature selecting
CN109754006A (en) A method and system for stereo vision content classification based on view and point cloud fusion
CN111797269A (en) Multi-view 3D model retrieval method based on multi-level view association convolutional network
CN116703895B (en) Small sample 3D visual detection method and system based on generation countermeasure network
CN109740539A (en) 3D object recognition method based on extreme learning machine and fusion convolutional network
Yuan et al. Few-shot scene classification with multi-attention deepemd network in remote sensing
Liang et al. MVCLN: multi-view convolutional LSTM network for cross-media 3D shape recognition
CN114972794A (en) 3D Object Recognition Method Based on Multi-view Pooling Transformer
CN114266967A (en) Cross-source remote sensing data target identification method based on symbolic distance characteristics
CN112329662B (en) Multi-view saliency estimation method based on unsupervised learning

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant