CN110457515B - Three-dimensional model retrieval method of multi-view neural network based on global feature capture aggregation
- Publication number
- CN110457515B (application CN201910653415.1A)
- Authority
- CN
- China
- Prior art keywords
- dimensional model
- dimensional
- model
- view
- neural network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/5854—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using shape and object relationship
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24147—Distances to closest patterns, e.g. nearest neighbour classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Biomedical Technology (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Library & Information Science (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Databases & Information Systems (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Image Analysis (AREA)
Abstract
A three-dimensional model retrieval method based on a multi-view neural network with global feature capture and aggregation. The method mines the intrinsic connections among the multi-view images of a three-dimensional model and aggregates those images effectively, yielding a compact and highly discriminative shape feature descriptor for the three-dimensional model. It comprises the following steps: (1) multi-view representation of the model, (2) network model design, (3) generation of the most difficult sample pairs, (4) network model training, (5) deep feature extraction, and (6) model retrieval. The method uses the non-local idea to mine the connections among the multiple views and uses a weighted local aggregation layer to fuse the multi-view features, thereby obtaining a highly discriminative and compact three-dimensional model descriptor.
Description
Technical Field
The invention belongs to the field of computer vision and deep learning. It discloses a three-dimensional model retrieval method based on a multi-view neural network with global feature capture and aggregation, which mines the intrinsic relations among the multi-view images representing a three-dimensional model and thereby improves the performance of three-dimensional model retrieval.
Background
With the great improvement in computer performance, large numbers of three-dimensional models are being produced in fields such as three-dimensional medicine, virtual reality and three-dimensional games, and how to recognize and retrieve these models has become a closely watched research direction in computer vision. Model representation methods in three-dimensional model retrieval fall into two types: 1) Model-based representation methods, such as discrete representations based on meshes or voxels, and representations based on point clouds. Features designed on model-based representations mostly rely on the shape of the model itself and its geometric properties, for example hand-designed three-dimensional histograms and bags of features built from surface curvature and normals. 2) Multi-view representation methods, in which a three-dimensional model is represented by two-dimensional images acquired from different viewpoints. Representations based on two-dimensional images also have a variety of hand-designed features, such as histograms of oriented gradients, Zernike moments and SIFT features.
However, conventional hand-designed features do not retrieve well: because they are designed by hand, the features produced by different design algorithms emphasize different aspects and cannot represent the model comprehensively. Deep learning techniques, such as the classical AlexNet and GoogLeNet deep convolutional neural networks, are now widely applied in computer vision. A neural network learns and fits image features from the data automatically and, compared with hand-designed features, can learn more comprehensive features, which greatly improves image recognition. In multi-view-based three-dimensional model retrieval, each three-dimensional model is represented by several view images, but existing deep neural networks are mainly used to recognize a single image, so their recognition performance is limited by incomplete information. How to aggregate the multi-view image information and how to capture the spatial characteristics of the model are the keys to improving the retrieval performance of three-dimensional models.
Aggregating the multi-view information and capturing the spatial features of the model cannot be achieved by simply concatenating the multi-view image features. Concatenation has several drawbacks: the feature dimension grows multiplicatively, which increases the retrieval time, and simple concatenation cannot capture the spatial features effectively, so the retrieval performance does not improve noticeably.
Disclosure of Invention
The invention aims to solve the problems that multi-view features cannot be effectively aggregated and model space information is lost in the conventional method, and provides a three-dimensional model retrieval method of a multi-view neural network based on global feature capture aggregation.
The method is used for mining the internal relation among the multi-view images of the three-dimensional model, capturing the spatial information of the three-dimensional model and simultaneously improving the retrieval speed by fusing the multi-view features. The invention is specifically verified in three-dimensional model retrieval.
The invention provides a multi-view neural network three-dimensional model retrieval method based on global feature capture aggregation, which comprises the following steps:
1st, multi-view representation of the three-dimensional model
The invention retrieves three-dimensional models on the basis of their multi-view representation. After the three-dimensional model data are obtained, viewing angles are set in processing software and a view image of the three-dimensional model is captured at each viewing angle.
2nd, designing the network model
A dedicated double-chain deep neural network model is designed according to the characteristics of three-dimensional model retrieval and is trained to learn a feature representation suited to three-dimensional models. The double-chain deep neural network model comprises 5 parts: a low-dimensional convolution module, a non-local module, a high-dimensional convolution module, a weighted local aggregation layer and a classification layer. In addition, a fused loss function based on the center loss and the pairwise boundary loss is designed to increase the distinguishability between three-dimensional models of different classes.
3rd, generating the most difficult sample pairs
The double-chain deep neural network model requires its input in the form of sample pairs. If all samples were paired with one another, the number of sample pairs would be extremely large. The most difficult sample pairs are therefore generated according to the principle that samples of the same class are paired when they are farthest apart and samples of different classes are paired when they are nearest.
4th, training the network model
The double-chain deep neural network is trained on a three-dimensional model training set; by optimizing the objective function, it learns network model parameters that comprehensively represent the training data.
5th, extracting deep features
In the retrieval process, each three-dimensional model is represented by a feature vector, and the invention extracts the features with the network model parameters trained in step 4. The input to the network model is the set of view images representing a single three-dimensional model; through feature extraction and aggregation by the double-chain deep neural network, the features of the view images are aggregated into one highly discriminative three-dimensional model feature descriptor.
6th, performing three-dimensional model retrieval
Given a three-dimensional model, the goal is to find in the target data set the three-dimensional models of the same class, i.e. the related three-dimensional models. The feature description and the distance measure are both very important in three-dimensional model retrieval. The feature description uses the deep features extracted in step 5, and the distance measure is the Euclidean distance, computed as follows:
$$d(x, y) = \sqrt{\sum_{i} (x_i - y_i)^2}$$
where $x$ and $y$ denote two three-dimensional models, $d(x, y)$ is the distance between them, and $x_i$, $y_i$ are the $i$-th feature dimensions of $x$ and $y$ respectively.
The advantages and beneficial effects of the invention:
1) For multi-view images, a non-local module is used to mine the potential relevance among the view images.
2) A weighted local aggregation layer aggregates the captured high-dimensional features of each view to obtain a highly discriminative three-dimensional model feature descriptor.
3) Through these two improvements, the invention achieves advanced performance in three-dimensional model retrieval; the retrieval results are shown in FIG. 5.
Drawings
FIG. 1 is a double-chain deep neural network structure designed by the present invention.
FIG. 2 is a retrieval flow diagram of the present invention.
FIG. 3 is an example of a three-dimensional model data set.
FIG. 4 is a multi-perspective image of a three-dimensional model.
FIG. 5 is a comparison of the retrieval performance of the method of the present invention with current advanced methods on the ModelNet40 dataset. The documents referenced in FIG. 5 are as follows.
[1] You H, Feng Y, Ji R, et al. PVNet: A Joint Convolutional Network of Point Cloud and Multi-View for 3D Shape Recognition[J]. ACM Multimedia, 2018: 1310-1318.
[2] He X, Zhou Y, Zhou Z, et al. Triplet-Center Loss for Multi-view 3D Object Retrieval[J]. Computer Vision and Pattern Recognition, 2018: 1945-1954.
[3] Yavartanoo M, Kim E Y, Lee K M, et al. SPNet: Deep 3D Object Classification and Retrieval using Stereographic Projection[J]. arXiv: Computer Vision and Pattern Recognition, 2018.
[4] Feng Y, Zhang Z, Zhao X, et al. GVCNN: Group-View Convolutional Neural Networks for 3D Shape Recognition[C]. Computer Vision and Pattern Recognition, 2018: 264-272.
[5] Su H, Maji S, Kalogerakis E, et al. Multi-view Convolutional Neural Networks for 3D Shape Recognition[J]. International Conference on Computer Vision, 2015: 945-953.
[6] Bai S, Bai X, Zhou Z, et al. GIFT: A Real-Time and Scalable 3D Shape Search Engine[J]. Computer Vision and Pattern Recognition, 2016: 5023-5032.
[7] Shi B, Bai S, Zhou Z, et al. DeepPano: Deep Panoramic Representation for 3-D Shape Recognition[J]. IEEE Signal Processing Letters, 2015, 22(12): 2339-2343.
[8] Sinha A, Bai J, Ramani K, et al. Deep Learning 3D Shape Surfaces Using Geometry Images[C]. European Conference on Computer Vision, 2016: 223-240.
[9] Wu Z, Song S, Khosla A, et al. 3D ShapeNets: A deep representation for volumetric shapes[J]. Computer Vision and Pattern Recognition, 2015: 1912-1920.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Example 1:
FIG. 2 is a flowchart of the steps of the three-dimensional model retrieval method of the present invention, which is based on a multi-view neural network with global feature capture and aggregation. The specific operation steps are as follows.
Step one, multi-view representation of the three-dimensional model
Three-dimensional models in fields such as medicine, games and industrial design are each stored as a single three-dimensional object, usually a polygonal mesh, i.e. a collection of points connected by edges that form faces. The method proposed by the present invention therefore needs to represent the three-dimensional model by multi-view images. To create the multi-view shape representation, viewpoints (virtual cameras) are set up to render each mesh. Twelve rendered view images are created by placing 12 virtual cameras around the mesh at 30-degree intervals, as shown in FIG. 4.
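The camera layout described above can be sketched as follows. This is only an illustrative sketch: the function name `camera_positions`, the camera distance `radius` and the elevation angle `elevation_deg` are assumptions, since the description specifies only the number of cameras and the 30-degree spacing.

```python
import math

def camera_positions(num_views=12, radius=2.0, elevation_deg=30.0):
    """Return (azimuth_deg, (x, y, z)) for cameras evenly spaced around the up axis.

    radius and elevation_deg are assumed values, not taken from the description.
    """
    step = 360.0 / num_views  # 12 views give 30 degrees between neighbouring cameras
    elev = math.radians(elevation_deg)
    cams = []
    for i in range(num_views):
        azim_deg = i * step
        azim = math.radians(azim_deg)
        # Spherical to Cartesian coordinates, with z as the up axis
        x = radius * math.cos(elev) * math.cos(azim)
        y = radius * math.cos(elev) * math.sin(azim)
        z = radius * math.sin(elev)
        cams.append((azim_deg, (x, y, z)))
    return cams

cams = camera_positions()
```

Each returned position would be handed to a renderer that points the virtual camera at the mesh centre; the rendering itself is outside this sketch.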
Step two, designing the network model
According to the characteristics of three-dimensional model retrieval, a dedicated double-chain deep neural network model is designed and trained to learn a feature representation suited to three-dimensional models. The double-chain deep neural network model comprises 5 parts: a low-dimensional convolution module, a non-local module, a high-dimensional convolution module, a weighted local aggregation layer and a classification layer. In addition, a fused loss function based on the center loss and the pairwise boundary loss is designed to increase the distinguishability between three-dimensional models of different classes.
The low-dimensional convolution module consists of a convolutional layer with a 7x7 kernel and a stride of 2, followed by a max pooling layer with a 3x3 kernel and a stride of 2. This module captures the low-dimensional features of the views. The non-local module mines the relations between the views: following the non-local idea, it builds a graph structure that connects the different view images. The formula of the non-local module is as follows.
$$y_a = \frac{1}{U(v)} \sum_{v_b \in V} g(v_a, v_b)\, h(v_b) \qquad (2)$$
Given a view set $V$, $v_a$ is the $a$-th view in $V$, $v_b$ is the $b$-th view in $V$, and $y_a$ is the output corresponding to $v_a$. The pairwise function $g$ computes the correlation between two views, the unary function $h$ scales the input, and the function $U$ performs normalization.
For convenience of the convolution operation, the pairwise function $g$ is defined as:
$$g(v_a, v_b) = \alpha(v_a)^{T} \beta(v_b) \qquad (3)$$
where $\alpha(v_a) = W_\alpha v_a$, $\beta(v_b) = W_\beta v_b$, and $W_\alpha$, $W_\beta$ are learnable weight matrices. The normalization factor is $U(v) = N$, where $N$ is the number of views in the view set $V$.
The unary function $h$ is a linear function:
$$h(m_b) = W_h m_b \qquad (4)$$
where $W_h$ denotes convolutional layer parameters, implemented as a 1x1 convolution, and $m_b$ denotes the original input of the module for view $v_b$, in the same sense as $m_a$ in formula (5). The output of the non-local module is:
$$z_a = W_z y_a + m_a \qquad (5)$$
where $W_z y_a + m_a$ is a residual connection, $y_a$ is computed by formula (2), $m_a$ is the original input, and $z_a$ is the final output of the non-local module. In this form the module can be inserted conveniently into an existing network model, and the original model does not need to be adjusted to accommodate it.
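The non-local computation over views, formulas (2) to (5), can be sketched in plain Python as below. This is a minimal sketch under stated assumptions: each view is reduced to a flat feature vector (in the network the inputs are feature maps and the weights are 1x1 convolutions), and the helper names `matvec`, `dot` and `non_local` are hypothetical.

```python
def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def non_local(views, W_alpha, W_beta, W_h, W_z):
    """Apply formulas (2)-(5) to a list of per-view feature vectors."""
    n = len(views)  # normalization factor U(v) = N, the number of views
    alpha = [matvec(W_alpha, v) for v in views]
    beta = [matvec(W_beta, v) for v in views]
    h = [matvec(W_h, v) for v in views]  # formula (4)
    outputs = []
    for a in range(n):
        y = [0.0] * len(h[0])
        for b in range(n):
            g = dot(alpha[a], beta[b])  # formula (3): pairwise correlation
            for k in range(len(y)):
                y[k] += g * h[b][k]
        y = [val / n for val in y]  # formula (2)
        # formula (5): residual connection with the original input
        z = [wz + m for wz, m in zip(matvec(W_z, y), views[a])]
        outputs.append(z)
    return outputs

identity = [[1.0, 0.0], [0.0, 1.0]]
views = [[1.0, 0.0], [0.0, 1.0]]
out = non_local(views, identity, identity, identity, identity)
```

With a zero matrix for `W_z` the output equals the input, which is why a block of this residual form can be inserted into a pre-trained network without disturbing its initial behaviour.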
A high-dimensional convolution module, made up of four residual convolution modules, is added after the non-local module to capture the high-level abstract features of the views. The first module comprises 6 convolutional layers with 3x3 kernels, the second comprises 8 convolutional layers with 3x3 kernels, the third comprises 12 convolutional layers with 3x3 kernels, and the fourth comprises 6 convolutional layers with 3x3 kernels. A residual operation is applied across every two convolutional layers, which effectively prevents gradient explosion.
After the high-level abstract features of the multiple views are extracted, a weighted local aggregation layer learns by itself the distinguishing weights of the views. It classifies the view images using the idea of virtual class centers: the virtual classes take part in the classification process but are discarded directly when the result is returned, which reduces the contribution of images with low distinctiveness. The input of the layer is the feature vector of each view, and its output is the class-center residual vector features with the virtual classes removed. Finally, a max-pooling operation aggregates these into a compact, highly discriminative model descriptor. The classification layer uses Softmax classification.
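One possible reading of the weighted local aggregation layer is sketched below. The description does not give the exact formulas, so everything here is an assumption made for illustration: the soft assignment uses a softmax over negative squared distances, the virtual centers are stored last in `centers`, and the names `weighted_residuals`, `max_pool_views` and `aggregate` are hypothetical.

```python
import math

def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_residuals(feature, centers, num_virtual):
    """Soft-assign one view feature to all centers; keep residuals of real centers only."""
    # Assumed similarity: negative squared distance to each center
    scores = [-sum((f - c) ** 2 for f, c in zip(feature, center)) for center in centers]
    weights = softmax(scores)  # virtual centers take part in the normalization
    num_real = len(centers) - num_virtual
    out = []
    for w, center in zip(weights[:num_real], centers[:num_real]):
        # Virtual centers are dropped here, so views assigned to them contribute little
        out.extend(w * (f - c) for f, c in zip(feature, center))
    return out

def max_pool_views(per_view):
    """Element-wise maximum across the per-view vectors."""
    return [max(column) for column in zip(*per_view)]

def aggregate(view_features, centers, num_virtual):
    per_view = [weighted_residuals(f, centers, num_virtual) for f in view_features]
    return max_pool_views(per_view)

centers = [[0.0, 0.0], [10.0, 10.0]]  # one real center, then one virtual center
descriptor = aggregate([[1.0, 0.0], [10.0, 10.0]], centers, num_virtual=1)
```

In this toy run the second view lies on the virtual center, so almost all of its assignment weight goes to a center that is discarded and it barely affects the pooled descriptor.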
The double-chain network is designed on the basis of the single chain, and several loss functions, namely a pairwise boundary loss function and a center loss function, are fused so that more distinctive feature descriptors are learned.
The pairwise boundary loss function is as follows:
$$L_b(x_i, x_j) = (1 - y_{ij})(\alpha - d_{ij}) + y_{ij}\,[d_{ij} - (\alpha - m)] \qquad (6)$$
where $x_i$ and $x_j$ are a pair of samples, $d_{ij}$ is the Euclidean distance between them, $y_{ij}$ is the relevance weight of the pair, $\alpha$ is the boundary, $m$ is the distance between the farthest positive sample and the nearest negative sample, and $L_b(x_i, x_j)$ is the resulting loss value.
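Formula (6) can be evaluated directly, as in the sketch below. It assumes that the relevance weight is 1 for a pair from the same class and 0 for a pair from different classes, which the description does not state explicitly; the defaults 1.2 and 0.4 are the values given in step four, and the function name is hypothetical.

```python
def pairwise_boundary_loss(d_ij, y_ij, alpha=1.2, m=0.4):
    """Formula (6) as written, with no clamping applied.

    d_ij: Euclidean distance between the two samples
    y_ij: relevance weight (assumed 1 for same class, 0 for different classes)
    """
    negative_term = (1 - y_ij) * (alpha - d_ij)       # pushes different-class pairs apart
    positive_term = y_ij * (d_ij - (alpha - m))       # pulls same-class pairs together
    return negative_term + positive_term

loss_pos = pairwise_boundary_loss(1.0, 1)  # same-class pair at distance 1.0
loss_neg = pairwise_boundary_loss(1.0, 0)  # different-class pair at distance 1.0
```

Under these assumptions a same-class pair is penalized once its distance exceeds $\alpha - m$, and a different-class pair is penalized while its distance is below $\alpha$.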
The center loss function, written in its standard form, is as follows:
$$L_c = \frac{1}{2} \sum_{i} \lVert x_i - k_{c_i} \rVert_2^2$$
where $x_i$ is a sample feature, $k_{c_i}$ is the class center of the class to which $x_i$ belongs, and $L_c$ is the corresponding center loss value. The loss measures the distance between each sample feature and the center of its class: the smaller the distance, the smaller the loss value. Minimizing it therefore reduces the intra-class distance and yields more distinctive feature descriptors.
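A small sketch of the center loss follows. It assumes the commonly used form, half the sum of squared Euclidean distances between each sample feature and its class center; the exact scaling in the original formula is not recoverable from the text, and the function name and the dictionary of centers are hypothetical.

```python
def center_loss(features, labels, centers):
    """Half the sum of squared distances from each feature to its class center.

    centers maps a class label to that class's center vector.
    """
    total = 0.0
    for x, label in zip(features, labels):
        center = centers[label]
        total += sum((a - b) ** 2 for a, b in zip(x, center))
    return 0.5 * total

features = [[1.0, 0.0], [0.0, 2.0]]
labels = [0, 1]
centers = {0: [0.0, 0.0], 1: [0.0, 0.0]}
loss = center_loss(features, labels, centers)
```

In training the class centers are usually learned or updated together with the network parameters; the sketch treats them as fixed inputs.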
Fusing these loss functions reduces the intra-class distance and increases the inter-class distance, so the network can learn more distinctive feature descriptors on its own.
Step three, generating the most difficult sample pairs
When the sample pairs are generated, suppose there are $c$ classes with $p$ objects in each class. To generate the most difficult positive sample pairs, the $k$ objects of the same class that are farthest from each object are selected for that object, giving $c \cdot k \cdot p$ positive sample pairs in total. To generate the negative sample pairs, each object is matched with the nearest object from the other classes, which serves as its nearest-neighbor sample of a different class, giving $c \cdot p^2$ sample pairs in total.
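The selection rule above can be sketched as follows. Several points are assumptions: distances are Euclidean distances between the current feature vectors, each object receives exactly one negative partner (its single nearest object from any other class), and the function name `hardest_pairs` is hypothetical.

```python
import math

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def hardest_pairs(features, labels, k=1):
    """Return (positive_pairs, negative_pairs) as lists of index pairs."""
    positives, negatives = [], []
    n = len(features)
    for i in range(n):
        same = [(euclidean(features[i], features[j]), j)
                for j in range(n) if j != i and labels[j] == labels[i]]
        other = [(euclidean(features[i], features[j]), j)
                 for j in range(n) if labels[j] != labels[i]]
        # Hardest positives: the k farthest objects of the same class
        for _, j in sorted(same, reverse=True)[:k]:
            positives.append((i, j))
        # Hardest negative: the nearest object of a different class
        if other:
            negatives.append((i, min(other)[1]))
    return positives, negatives

features = [[0.0], [1.0], [3.0], [10.0], [11.0], [14.0]]
labels = [0, 0, 0, 1, 1, 1]
positives, negatives = hardest_pairs(features, labels, k=1)
```

With 2 classes, 3 objects per class and k = 1, the sketch produces 2 x 1 x 3 = 6 positive pairs, in line with the count given for positive pairs.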
Step four, training the network model
The double-chain deep neural network is trained on a three-dimensional model training set; by optimizing the objective function, it learns network model parameters that comprehensively represent the training data.
The invention can train the network model with the PyTorch deep learning framework. The input data are first preprocessed: data normalization, unification of the original image size, random cropping, random horizontal flipping and vertical flipping of the images. Data normalization maps the raw data to a statistical distribution on a fixed interval, which accelerates convergence. The image size is unified because the input size of the network model is fixed once the model has been designed, so the input images must match the size the network requires. Random cropping, horizontal flipping and vertical flipping increase the amount of data and keep the network model from overfitting. For the initial parameter settings, the number of training epochs is set to 8 with 20 sample pairs per iteration, the initial learning rate is set to 0.0001, and the network is initialized with parameters pre-trained on the large ImageNet dataset. The parameter $\alpha$ is set to 1.2 and $m$ is set to 0.4. The invention uses an adaptive gradient optimizer, which adapts the learning rate for each parameter; the update step is computed from both the first-moment estimate and the second-moment estimate of the gradient.
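The settings above and the moment-based update can be sketched without any framework. The update rule shown is the well-known Adam form, chosen here because it matches the description of first-moment and second-moment estimates; the description does not name the optimizer, so this choice, the decay rates `beta1` and `beta2`, the constant `eps`, and all identifiers are assumptions.

```python
import math

# Values taken from the description; the key names are illustrative
CONFIG = {
    "epochs": 8,
    "pairs_per_iteration": 20,
    "learning_rate": 1e-4,
    "alpha": 1.2,
    "m": 0.4,
}

def adaptive_step(param, grad, state, lr=CONFIG["learning_rate"],
                  beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam-style update of a scalar parameter (assumed optimizer)."""
    state["t"] += 1
    # Exponential moving averages of the gradient and the squared gradient
    state["m1"] = beta1 * state["m1"] + (1 - beta1) * grad
    state["m2"] = beta2 * state["m2"] + (1 - beta2) * grad * grad
    # Bias correction for the zero-initialized averages
    m1_hat = state["m1"] / (1 - beta1 ** state["t"])
    m2_hat = state["m2"] / (1 - beta2 ** state["t"])
    return param - lr * m1_hat / (math.sqrt(m2_hat) + eps)

state = {"t": 0, "m1": 0.0, "m2": 0.0}
new_param = adaptive_step(1.0, 0.5, state)
```

On the first step the update size is close to the learning rate regardless of the gradient magnitude, which illustrates how the step is scaled per parameter.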
Step five, extracting deep features
In the retrieval process, each three-dimensional model is represented by a feature vector. The network is trained with the double-chain structure, and the network model parameters trained in step 4 are then used as the model parameters for feature extraction; at extraction time, the parameters of a single chain of the network are used. The input to the network model is the set of view images representing a single three-dimensional model. The network structure proposed by the invention extracts and aggregates the features, so that the features of the view images are aggregated into the feature of the single three-dimensional model. The extracted feature has 512 dimensions.
Step six, three-dimensional model retrieval
Given a three-dimensional model, the three-dimensional models of the same class, i.e. the related three-dimensional models, are to be found in a target data set. Let the set of query three-dimensional models be $Q$ and the data set to be searched be $G$; the goal is to find in $G$ the three-dimensional models related to the three-dimensional models in $Q$. This is done by computing the relevance between a three-dimensional model $Q_i$ and each three-dimensional model in $G$ and sorting by relevance to obtain the three-dimensional models related to $Q_i$. The specific implementation is as follows.
Both the query set and the data set to be searched are represented by feature vectors, which the invention extracts as in step 5. After the feature representation of each three-dimensional model in the two sets is obtained, the distance between the three-dimensional model $Q_i$ and each three-dimensional model $G_j$ in $G$ is computed as:
$$L_{ij} = f(Q_i, G_j)$$
where $L_{ij}$ is the distance between the three-dimensional models $Q_i$ and $G_j$, and $f(Q_i, G_j)$ is the distance measure between two three-dimensional models. The invention uses the Euclidean distance, computed as:
$$d(x, y) = \sqrt{\sum_{i} (x_i - y_i)^2}$$
where $x$ and $y$ denote two different three-dimensional models, $d(x, y)$ is the distance between them, and $x_i$, $y_i$ are the $i$-th feature dimensions of $x$ and $y$ respectively. After the distances between $Q_i$ and every three-dimensional model in $G$ are computed, the distances are sorted in ascending order, and the three-dimensional models with the first $k$ distances can be taken as those related to $Q_i$. Repeating this for each three-dimensional model in $Q$ gives the related three-dimensional models in $G$; the results on the ModelNet40 dataset are shown in FIG. 5.
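The ranking step can be sketched as below, using toy 2-dimensional features in place of the 512-dimensional descriptors. The function names `euclidean` and `retrieve` are hypothetical, and ties between equal distances are broken by gallery index, which the description does not specify.

```python
import math

def euclidean(x, y):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def retrieve(query, gallery, k):
    """Return the indices of the k gallery models closest to the query."""
    distances = [(euclidean(query, g), j) for j, g in enumerate(gallery)]
    distances.sort()  # ascending: the smallest distance is the most related model
    return [j for _, j in distances[:k]]

query = [0.0, 0.0]
gallery = [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0]]
top2 = retrieve(query, gallery, k=2)
```

Here the distances to the three gallery models are 5, 1 and 2, so the two closest models are those at indices 1 and 2.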
Finally, it should be noted that the above embodiments only illustrate the technical solutions of the present invention and are not limiting. Although the present invention has been described in detail with reference to the embodiments, those skilled in the art will understand that various changes may be made and equivalents substituted without departing from the spirit and scope of the invention as defined in the appended claims.
Claims (1)
1. A three-dimensional model retrieval method based on a multi-view neural network with global feature capture and aggregation, used for mining the potential intrinsic connections among the multiple views of a three-dimensional model so as to improve the performance of three-dimensional model retrieval, characterized by comprising the following steps:
1st, multi-view representation of the three-dimensional model
The three-dimensional model is retrieved on the basis of its multi-view representation; after the three-dimensional model data are obtained, viewing angles are set in processing software, and a view image of the three-dimensional model is captured at each viewing angle;
2nd, designing the network model
A dedicated double-chain deep neural network model is designed according to the characteristics of three-dimensional model retrieval and is trained to learn a feature representation suited to three-dimensional models; the double-chain deep neural network model comprises 5 parts, namely a low-dimensional convolutional layer, a non-local module, a high-dimensional convolutional layer, a weighted local aggregation layer and a classification layer; in addition, a fused loss function based on the center loss and the pairwise boundary loss is designed to increase the distinguishability between three-dimensional models of different classes;
3rd, generating the most difficult sample pairs
The double-chain deep neural network model takes its input in the form of sample pairs; if all samples were paired, the number of sample pairs would be extremely large, so the most difficult sample pairs are generated according to the principle that samples of the same class are paired when they are farthest apart and samples of different classes are paired when they are nearest;
4th, training the network model
The double-chain deep neural network is trained on a three-dimensional model training set, and through the objective function it learns network model parameters that comprehensively represent the training data;
5th, extracting deep features
In the retrieval process, each three-dimensional model is represented by a feature vector; the method extracts the features with the network model parameters trained in step 4; the input to the network model is the set of view images representing a single three-dimensional model, and through feature extraction and aggregation by the double-chain deep neural network the features of the view images are aggregated into the feature of the single three-dimensional model;
6th, performing three-dimensional model retrieval
Given a three-dimensional model, the three-dimensional models of the same class, i.e. the related three-dimensional models, are found in a target data set; the feature description and the distance measure are both very important in three-dimensional model retrieval; the feature description uses the deep features extracted in step 5, and the distance measure is the Euclidean distance, computed as follows:
$$d(x, y) = \sqrt{\sum_{i} (x_i - y_i)^2}$$
where $x$ and $y$ denote two different three-dimensional models, $d(x, y)$ is the distance between them, and $x_i$, $y_i$ are the $i$-th feature dimensions of $x$ and $y$ respectively.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910653415.1A CN110457515B (en) | 2019-07-19 | 2019-07-19 | Three-dimensional model retrieval method of multi-view neural network based on global feature capture aggregation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910653415.1A CN110457515B (en) | 2019-07-19 | 2019-07-19 | Three-dimensional model retrieval method of multi-view neural network based on global feature capture aggregation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110457515A (en) | 2019-11-15 |
CN110457515B (en) | 2021-08-24 |
Family
ID=68481530
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910653415.1A Active CN110457515B (en) | 2019-07-19 | 2019-07-19 | Three-dimensional model retrieval method of multi-view neural network based on global feature capture aggregation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110457515B (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111078913A (en) * | 2019-12-16 | 2020-04-28 | 天津运泰科技有限公司 | Three-dimensional model retrieval method based on multi-view convolution neural network |
CN111382300B (en) * | 2020-02-11 | 2023-06-06 | 山东师范大学 | Multi-view three-dimensional model retrieval method and system based on pairing depth feature learning |
CN111340866B (en) * | 2020-02-26 | 2024-03-01 | 腾讯科技(深圳)有限公司 | Depth image generation method, device and storage medium |
CN111598130A (en) * | 2020-04-08 | 2020-08-28 | 天津大学 | Traditional Chinese medicine identification method based on multi-view convolutional neural network |
CN111513709B (en) * | 2020-05-20 | 2021-08-24 | 浙江大学 | Non-local neural network myocardial transmembrane potential reconstruction method based on iterative contraction threshold algorithm |
CN111914697A (en) * | 2020-07-16 | 2020-11-10 | 天津大学 | Multi-view target identification method based on view semantic information and sequence context information |
CN113869120B (en) * | 2021-08-26 | 2022-08-05 | 西北大学 | Aggregation convolution three-dimensional model classification method based on view filtering |
CN114238676A (en) * | 2021-12-22 | 2022-03-25 | 芯勍(上海)智能化科技股份有限公司 | MBD model retrieval method and device based on graph neural network |
CN116310425B (en) * | 2023-05-24 | 2023-09-26 | 山东大学 | Fine-grained image retrieval method, system, equipment and storage medium |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104268592A (en) * | 2014-09-22 | 2015-01-07 | 天津理工大学 | Multi-view combined movement dictionary learning method based on collaboration expression and judgment criterion |
CN106204467A (en) * | 2016-06-27 | 2016-12-07 | 深圳市未来媒体技术研究院 | Image denoising method based on cascaded residual neural network |
CN106682233A (en) * | 2017-01-16 | 2017-05-17 | 华侨大学 | Method for Hash image retrieval based on deep learning and local feature fusion |
CN106844620A (en) * | 2017-01-19 | 2017-06-13 | 天津大学 | View-based feature matching method for three-dimensional model retrieval |
WO2017156243A1 (en) * | 2016-03-11 | 2017-09-14 | Siemens Aktiengesellschaft | Deep-learning based feature mining for 2.5d sensing image search |
KR101854048B1 (en) * | 2016-11-25 | 2018-05-02 | 연세대학교 산학협력단 | Method and device for measuring confidence of depth map by stereo matching |
CN108492364A (en) * | 2018-03-27 | 2018-09-04 | 百度在线网络技术(北京)有限公司 | The method and apparatus for generating model for generating image |
CN109213884A (en) * | 2018-11-26 | 2019-01-15 | 北方民族大学 | Cross-modal retrieval method for sketch-based three-dimensional model search |
Non-Patent Citations (1)
Title |
---|
Three-dimensional model retrieval algorithm based on residual network; Li Mengmin; Computer Science (《计算机科学》); 2019-03-31; Vol. 46, No. 3; full text * |
Also Published As
Publication number | Publication date |
---|---|
CN110457515A (en) | 2019-11-15 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN110457515B (en) | Three-dimensional model retrieval method of multi-view neural network based on global feature capture aggregation | |
Qi et al. | Volumetric and multi-view CNNs for object classification on 3D data | |
Xian et al. | Monocular relative depth perception with web stereo data supervision | |
CN110543581B (en) | Multi-view three-dimensional model retrieval method based on non-local graph convolution network | |
Li et al. | SHREC’13 track: large scale sketch-based 3D shape retrieval | |
CN109063139B (en) | Three-dimensional model classification and retrieval method based on panorama and multi-channel CNN | |
CN109063649B (en) | Pedestrian re-identification method based on Siamese pedestrian alignment residual network | |
CN114255238A (en) | Three-dimensional point cloud scene segmentation method and system fusing image features | |
CN111625667A (en) | Three-dimensional model cross-domain retrieval method and system based on complex background image | |
CN109766873B (en) | Pedestrian re-identification method based on hybrid deformable convolution | |
CN105205135B (en) | 3D model retrieval method and retrieval device based on topic model | |
CN111723600B (en) | Pedestrian re-recognition feature descriptor based on multi-task learning | |
CN112183240A (en) | Two-stream convolutional action recognition method based on 3D temporal stream and parallel spatial stream | |
CN112329662B (en) | Multi-view saliency estimation method based on unsupervised learning | |
CN111797269A (en) | Multi-view three-dimensional model retrieval method based on multi-level view associated convolutional network | |
Liang et al. | MVCLN: multi-view convolutional LSTM network for cross-media 3D shape recognition | |
CN114299339A (en) | Three-dimensional point cloud model classification method and system based on regional correlation modeling | |
CN111191729B (en) | Three-dimensional object fusion feature representation method based on multi-modal feature fusion | |
CN117312594A (en) | Sketching mechanical part library retrieval method integrating double-scale features | |
Liu et al. | Deep learning of directional truncated signed distance function for robust 3D object recognition | |
CN117351078A (en) | Object size and 6D pose estimation method based on shape prior | |
Özyurt et al. | A new method for classification of images using convolutional neural network based on Dwt-Svd perceptual hash function | |
Xu et al. | Learning discriminative and generative shape embeddings for three-dimensional shape retrieval | |
CN114741549A (en) | Image duplicate checking method and device based on LIRE, computer equipment and storage medium | |
Li et al. | Dynamic local feature aggregation for learning on point clouds | |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |