CN111368123B - Three-dimensional model sketch retrieval method based on cross-modal guide network - Google Patents

Three-dimensional model sketch retrieval method based on cross-modal guide network

Info

Publication number
CN111368123B
Authority
CN
China
Prior art keywords
dimensional model
sketch
network
features
dimensional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010097592.9A
Other languages
Chinese (zh)
Other versions
CN111368123A (en)
Inventor
梁爽 (Liang Shuang)
戴伟东 (Dai Weidong)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongji University
Original Assignee
Tongji University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongji University filed Critical Tongji University
Priority to CN202010097592.9A priority Critical patent/CN111368123B/en
Publication of CN111368123A publication Critical patent/CN111368123A/en
Application granted granted Critical
Publication of CN111368123B publication Critical patent/CN111368123B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 — Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 — Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 — Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 — Retrieval characterised by using metadata automatically derived from the content
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 — Pattern recognition
    • G06F18/20 — Analysing
    • G06F18/21 — Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 — Pattern recognition
    • G06F18/20 — Analysing
    • G06F18/24 — Classification techniques
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 — Pattern recognition
    • G06F18/20 — Analysing
    • G06F18/25 — Fusion techniques
    • G06F18/253 — Fusion techniques of extracted features

Abstract

The invention relates to a three-dimensional model sketch retrieval method based on a cross-modal guide network, which comprises the following steps: S1: acquiring three-dimensional model training data and sketch training data; S2: training a three-dimensional model network, and using the trained three-dimensional model network to learn a three-dimensional model feature space; S3: training a sketch network with the three-dimensional model feature space as the target space to obtain a trained sketch network; S4: extracting the features of the three-dimensional models to be retrieved and the features of the query sketch with the trained three-dimensional model network and sketch network, and performing retrieval to obtain the three-dimensional models for the corresponding application.

Description

Three-dimensional model sketch retrieval method based on cross-modal guide network
Technical Field
The invention relates to the field of sketch-based three-dimensional model retrieval, and in particular to a three-dimensional model sketch retrieval method based on a cross-modal guide network.
Background
Compared with two-dimensional images, three-dimensional models carry richer information and can reflect objective reality more comprehensively, so they are widely used in fields such as architecture and medicine. In recent years, as three-dimensional scanning, three-dimensional printing, and three-dimensional reconstruction techniques have matured, the number of three-dimensional models has grown rapidly, and how to retrieve these three-dimensional models effectively from a three-dimensional model library has become a major concern. Early retrieval methods were mainly based on keywords or on three-dimensional model examples. Keyword-based retrieval has two drawbacks: on the one hand, the three-dimensional model library must be annotated with a large number of text labels in advance, which is time-consuming and labor-intensive; on the other hand, keywords can hardly describe a user's query needs intuitively. Retrieval based on an example three-dimensional model is straightforward, but difficult to apply in practice because users rarely have a three-dimensional model available as the query input. In recent years, hand-drawn sketches have become a popular means of human-computer interaction. Compared with a three-dimensional model, a hand-drawn sketch is very easy to obtain; compared with keywords, a hand-drawn sketch expresses the user's needs more intuitively. Sketch-based three-dimensional model retrieval has therefore become a research direction that attracts much attention in the field of computer vision.
Early sketch-based three-dimensional model retrieval methods mainly relied on hand-crafted features. These methods design corresponding manual features for the sketch and the three-dimensional model respectively, and then directly measure the similarity between the cross-modal features. Examples include the Gabor local line-based feature (GALIF) method proposed by Eitz et al. and the Cross-Domain Manifold Ranking (CDMR) method proposed by Furuya et al.
In recent years, with the great success of deep learning in computer vision, a variety of deep learning methods have been applied to sketch-based three-dimensional model retrieval. Most of these methods extract features with heterogeneous twin (Siamese-style) convolutional neural networks: two networks extract deep features from the sketch and the three-dimensional model respectively, and a shared loss function then matches the features of the two modalities and performs metric learning. From the point of view of the loss function, these methods fall into two broad categories. The first treats three-dimensional model sketch retrieval as metric learning: for a sketch, several positive and negative sample pairs are constructed from sketches and three-dimensional models, and a metric loss function optimizes the network so that positive pairs are drawn together and negative pairs are pushed apart, finally aligning the cross-modal features; representative examples include the Siamese method proposed by Wang et al. and the Deep Correlated Metric Learning (DCML) method proposed by Dai et al. The second makes full use of category information and treats the task as classification: the features output by the sketch network and the three-dimensional model network are fed into a shared classifier, and a discriminative classification loss function optimizes both networks simultaneously, so that sketches and three-dimensional models of the same category are aggregated while those of different categories are separated as much as possible; representative examples include the Triplet-Center Loss (TCL) method proposed by He et al. and the "point-to-subspace" method proposed by Lei et al. Because deep neural networks can learn deeper features, deep learning methods have greatly improved the performance of three-dimensional model sketch retrieval.
However, these deep learning methods use two neural networks to extract the features of the two modalities simultaneously and then map the extracted features of the two modalities directly into a common subspace. Such direct mapping of cross-modal features makes it difficult to effectively reduce the cross-modal difference between sketches and three-dimensional models, which in turn limits cross-modal retrieval performance.
Disclosure of Invention
The invention aims to overcome the defects in the prior art and to provide a three-dimensional model sketch retrieval method based on a cross-modal guide network that effectively alleviates the cross-modal difference problem.
The purpose of the invention can be realized by the following technical scheme:
a three-dimensional model sketch retrieval method based on a cross-modal guide network comprises the following steps:
s1: acquiring three-dimensional model training data and sketch training data;
s2: training a three-dimensional model network, and learning by using the trained three-dimensional model network to obtain a three-dimensional model characteristic space;
s3: training a sketch network by taking the three-dimensional model feature space as a target space to obtain a trained sketch network;
s4: and retrieving to-be-retrieved three-dimensional model features and query sketch features extracted by the trained three-dimensional model network and sketch network to obtain the three-dimensional model for corresponding application.
Further, the step S2 specifically includes:
s21: constructing a three-dimensional model network;
s22: using the classification loss function L_AM-S, inputting the three-dimensional model training data into the three-dimensional model network for training to obtain a trained three-dimensional model network;
s23: inputting the three-dimensional model training data into a trained three-dimensional model network, and learning to obtain three-dimensional model characteristics of all three-dimensional model training data and a classified three-dimensional model characteristic space;
s24: and calculating the class center of each class of three-dimensional model features in the three-dimensional model feature space according to the class information.
Further preferably, the classification loss function L_AM-S is the AM-softmax classification loss function, whose expression is:

L_{AM-S} = -\frac{1}{K}\sum_{k=1}^{K}\log\frac{e^{s\,(W_{y_k}^{\top} f_k - n)}}{e^{s\,(W_{y_k}^{\top} f_k - n)} + \sum_{j\neq y_k} e^{s\,W_j^{\top} f_k}}

where f_k is the three-dimensional model feature input to the classifier, W is the weight of the classifier (W_j denoting the weight vector of class j and y_k the class label of the k-th sample), K is the number of three-dimensional model samples in a training batch, n is a boundary coefficient, and s is a scaling coefficient applied after the weights and the three-dimensional model features are normalized.
Further, the step S3 specifically includes:
s31: constructing a sketch network;
s32: constructing a guide loss function L_G by using the class centers and class information of the three-dimensional model features;
S33: using the guide loss function L_G, inputting the sketch training data into the sketch network for training to obtain the trained sketch network.
The guide loss function L_G constrains the sketch features extracted by the sketch network into the three-dimensional model feature space and aligns sketch features having the same class information with the three-dimensional model features.
Further, the expression of the guide loss function L_G is:

L_G = L_c - \lambda L_a

L_c = \frac{1}{m}\sum_{i=1}^{m}\bigl(1 - \cos(f_i, c_{y_i})\bigr)

L_a = \frac{1}{m}\sum_{i=1}^{m}\sum_{j=1,\,j\neq y_i}^{N}\bigl(1 - \cos(f_i, c_j)\bigr)

where L_c is the cosine distance between the sketch features and the class centers of the three-dimensional model features of the same class, L_a is the sum of the cosine distances between the sketch features and the class centers of the three-dimensional model features of the other classes, λ is a hyper-parameter set to 0.01, m is the size of one batch of input data during sketch network training, f_i is the sketch feature of the i-th sketch, y_i is the class of the i-th sketch feature, c is a class center of the three-dimensional model features, c_{y_i} is the class center of the three-dimensional model features of the same class as the i-th sketch feature, c_j is the class center of the three-dimensional model features of a class different from that of the i-th sketch feature, and N is the total number of three-dimensional model feature classes.
Preferably, the three-dimensional model network comprises a first deep convolutional neural network CNN1 and a first fully connected layer FC1, and the sketch network comprises a second deep convolutional neural network CNN2 and a second fully connected layer FC2.
Preferably, the three-dimensional model training data includes two-dimensional view maps corresponding to all three-dimensional models in the three-dimensional model data set, and the size of the sketch training data is the same as that of the two-dimensional view map of the three-dimensional model.
The first deep convolutional neural network CNN1 extracts features from each two-dimensional view map and performs feature fusion, the first fully connected layer FC1 outputs the three-dimensional model features, and the second fully connected layer FC2 outputs the sketch features.
Further, the step S4 specifically includes:
s41: rendering all three-dimensional models to be retrieved into a two-dimensional view map;
s42: inputting a two-dimensional view map of a three-dimensional model to be retrieved into a three-dimensional model network, and extracting characteristics of the three-dimensional model to be retrieved; inputting the query sketch into a sketch network, and extracting the characteristics of the query sketch;
s43: calculating the cosine distances between the query sketch features and the features of the three-dimensional models to be retrieved, and sorting the distances;
s44: outputting, according to the sorted result, the three-dimensional model corresponding to each distance in turn to complete the three-dimensional model retrieval.
Compared with the prior art, the invention has the following advantages:
1) According to the invention, the cross-modal feature space is learned indirectly: a strongly discriminative three-dimensional model feature space is first trained in advance by exploiting the rich features of three-dimensional models, and this feature space is then used to guide the training of the sketch network and to transfer the sketch features into the three-dimensional model feature space, which effectively reduces the cross-modal difference between sketches and three-dimensional models;
2) The guide loss function of the present invention comprises two parts: L_c gathers sketch features and three-dimensional model features with the same class information as closely as possible, and L_a separates sketch features and three-dimensional model features of different classes as much as possible. Constraining the sketch network training with this guide loss function therefore aligns sketch features and three-dimensional model features that share class information more effectively, and finally improves the accuracy of sketch-based three-dimensional model retrieval.
Drawings
FIG. 1 is a schematic work flow diagram of the overall framework of the present invention;
FIG. 2 is a flow chart of a method provided in an embodiment;
FIG. 3 is a schematic diagram of the guide loss function L_G;
FIG. 4 is the PR curve of the method of the present invention and other methods on the SHREC 2013 data set;
FIG. 5 is the PR curve of the method of the present invention and other methods on the SHREC 2014 data set.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, shall fall within the scope of protection of the present invention.
Examples
As shown in fig. 2, the three-dimensional model sketch retrieval method based on a cross-modal guide network provided by the present invention mainly comprises the following five steps:
1) rendering each three-dimensional model in the three-dimensional model data set into a plurality of two-dimensional view maps;
2) inputting a plurality of two-dimensional view maps into a three-dimensional model network, and representing the three-dimensional model characteristics by using the two-dimensional view map characteristics obtained by training;
3) guiding the training of the sketch network by using the three-dimensional model characteristics obtained in the step 2), and learning the sketch characteristics into a three-dimensional model characteristic space;
4) for the three-dimensional model library to be retrieved, extracting the characteristics of the three-dimensional model to be retrieved by using the three-dimensional model network trained in the step 2), and for the query sketch, extracting the characteristics of the sketch by using the sketch network trained in the step 3);
5) calculating the cosine distance between the query sketch features and the features of each three-dimensional model to be retrieved in the three-dimensional model library, and sorting the distances to complete the sketch-based three-dimensional model retrieval.
Fig. 1 shows a schematic workflow of the present invention. The method is also described in detail in the following sections of the specification.
The specific method of the step 1) comprises the following steps:
Firstly, 12 virtual cameras are placed uniformly around the three-dimensional model, i.e., one virtual camera every 30 degrees; each virtual camera then renders the three-dimensional model into a two-dimensional view map from its own viewpoint, so that each three-dimensional model finally yields 12 two-dimensional view maps.
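For illustration only, the 30-degree camera spacing described above can be expressed as a small numerical sketch in Python/NumPy; the radius and elevation values below are arbitrary assumptions, and the rendering back end that actually produces the two-dimensional view maps is not specified by this step:

    import numpy as np

    def camera_positions(radius=2.0, num_views=12, elevation_deg=30.0):
        # Place num_views virtual cameras on a circle around the model,
        # one every 360/num_views degrees (30 degrees when num_views = 12).
        elev = np.deg2rad(elevation_deg)
        positions = []
        for k in range(num_views):
            azim = np.deg2rad(k * 360.0 / num_views)
            positions.append((radius * np.cos(elev) * np.cos(azim),
                              radius * np.cos(elev) * np.sin(azim),
                              radius * np.sin(elev)))
        return positions  # each position is then used to render one view map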
The specific method of the step 2) comprises the following steps:
21) The two-dimensional view maps of the three-dimensional models are resized to a uniform 224 × 224 and normalized to the range 0-1. The 12 two-dimensional view maps of each model are then input in turn into the first deep convolutional neural network CNN1 to obtain the features of the 12 two-dimensional view maps.
22) At the end of the first deep convolutional neural network CNN1, the 12 two-dimensional view map features are fused into a single feature by an average pooling (mean-pooling) layer, and the fused feature is input into the first fully connected layer FC1 for further feature extraction.
23) The features output by the first fully connected layer FC1 are fed into a classifier, and the whole three-dimensional model network (denoted CNN1-FC1) is optimized with the AM-softmax classification loss function to obtain the trained three-dimensional model network.
The first deep convolutional neural network CNN1 may be any form of convolutional network, and the first fully connected layer FC1 may likewise be any form of fully connected network.
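A minimal, hypothetical PyTorch sketch of such a multi-view network is given below (an AlexNet backbone standing in for CNN1, mean pooling over the 12 view features, and a fully connected layer standing in for FC1 feeding a cosine classifier); the feature dimension, class count, and all names are illustrative assumptions rather than the patent's implementation:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    from torchvision import models

    class ModelNet3D(nn.Module):
        def __init__(self, feat_dim=512, num_classes=90):
            super().__init__()
            self.cnn1 = models.alexnet(weights=None).features   # CNN1: per-view backbone
            self.pool = nn.AdaptiveAvgPool2d((6, 6))
            self.fc1 = nn.Linear(256 * 6 * 6, feat_dim)          # FC1: 3D model feature
            self.classifier = nn.Linear(feat_dim, num_classes, bias=False)

        def forward(self, views):                 # views: (batch, 12, 3, 224, 224)
            b, v = views.shape[:2]
            x = self.cnn1(views.flatten(0, 1))    # per-view convolutional features
            x = self.pool(x).flatten(1).view(b, v, -1)
            x = x.mean(dim=1)                     # mean pooling fuses the 12 view features
            feat = self.fc1(x)                    # three-dimensional model feature
            # cosine logits: features and classifier weights are L2-normalized,
            # matching the AM-softmax setting described below
            cos = F.normalize(feat, dim=1) @ F.normalize(self.classifier.weight, dim=1).t()
            return feat, cos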
The AM-softmax classification loss function is an improvement of softmax; it reduces the cosine distances among data of the same class and increases the cosine distances among different classes, so that the learned features are more discriminative. Its expression is as follows:
L_{AM-S} = -\frac{1}{K}\sum_{k=1}^{K}\log\frac{e^{s\,(W_{y_k}^{\top} f_k - n)}}{e^{s\,(W_{y_k}^{\top} f_k - n)} + \sum_{j\neq y_k} e^{s\,W_j^{\top} f_k}}

where f_k is the three-dimensional model feature input to the classifier; W is the weight of the classifier (W_j denoting the weight vector of class j and y_k the class label of the k-th sample); K is the number of three-dimensional model samples in a training batch; n is a boundary coefficient that constrains the cosine distance margin between different classes; and s is a scaling coefficient applied after the weights and the three-dimensional model features are normalized, which helps the training of the three-dimensional model network converge.
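Under these definitions, the AM-softmax loss can be sketched as follows (PyTorch; the cosine logits are the normalized products W_j^T f_k, and the values of s and n shown here are common defaults rather than values fixed by the patent):

    import torch
    import torch.nn.functional as F

    def am_softmax_loss(cos_logits, labels, s=30.0, n=0.35):
        # cos_logits: (batch, num_classes) cosine similarities from normalized
        # classifier weights and features; n is the boundary (margin) coefficient,
        # s the scaling coefficient.
        one_hot = F.one_hot(labels, cos_logits.size(1)).float()
        logits = s * (cos_logits - n * one_hot)   # subtract the margin only for the true class
        return F.cross_entropy(logits, labels)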
24) The three-dimensional model training data are input once more into the three-dimensional model network trained in step 23), and the three-dimensional model features output by the first fully connected layer FC1 are extracted, yielding the three-dimensional model features of the training data and a class-separable three-dimensional model feature space. The class centers of the three-dimensional model features are then computed from the class information and recorded as the set C = {c_1, c_2, c_3, …, c_N}, where N denotes the number of classes in the data set.
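A minimal sketch of how the class centers c_1, ..., c_N could be computed from the extracted features; averaging the features of each class and re-normalizing the result is an assumption of this sketch, not a detail given in the text:

    import torch

    def class_centers(features, labels, num_classes):
        # features: (num_samples, feat_dim); labels: (num_samples,)
        centers = [features[labels == c].mean(dim=0) for c in range(num_classes)]
        return torch.nn.functional.normalize(torch.stack(centers), dim=1)  # C = {c_1, ..., c_N}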
The specific method of the step 3) is as follows:
31) The sketches are resized to a uniform 224 × 224 and normalized to the range 0-1, and then input into the second deep convolutional neural network CNN2 and the second fully connected layer FC2. The sketch network (denoted CNN2-FC2) has the same structure as the three-dimensional model network CNN1-FC1 in step 2), but its parameters are different.
32) Taking the three-dimensional model feature space learned in advance in step 2) as the target space, a guide loss function, denoted L_G, is constructed from the class centers and class information of the three-dimensional model features learned in advance. As shown in FIG. 3, the purpose of the guide loss function L_G is to transfer the sketch features output by the second fully connected layer FC2 into the three-dimensional model feature space so as to reduce the cross-modal feature difference, and to gather together sketch features and three-dimensional model features with the same class information so as to achieve cross-modal feature alignment. The guide loss function L_G is computed as follows:
L_G = L_c - \lambda L_a
where L_c denotes the cosine distance between the sketch features output by the second fully connected layer FC2 and the three-dimensional model class centers of the same class; as shown in FIG. 3, its role is to gather sketch features and three-dimensional model features with the same class information as closely as possible. L_a denotes the sum of the cosine distances between the sketch features output by the second fully connected layer FC2 and the centers of the other classes; as shown in FIG. 3, its role is to separate the sketch features from three-dimensional model features of different classes as far as possible. λ is a hyper-parameter used to balance the weights of the two terms and is set to 0.01.
Specifically, L_c and L_a are formulated as follows:

L_c = \frac{1}{m}\sum_{i=1}^{m}\bigl(1 - \cos(f_i, c_{y_i})\bigr)

L_a = \frac{1}{m}\sum_{i=1}^{m}\sum_{j=1,\,j\neq y_i}^{N}\bigl(1 - \cos(f_i, c_j)\bigr)

where m denotes the size of a batch input into the sketch network during training; f_i denotes the sketch feature of the i-th sketch; y_i denotes the class of the i-th sketch feature; c denotes a class center of the three-dimensional model features learned in step 2), with its subscript indicating the class to which the center belongs; and N denotes the total number of classes.
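A hedged PyTorch sketch of the guide loss under these definitions follows; taking the cosine distance as 1 − cos(·,·) and averaging both terms over the batch are assumptions of this sketch:

    import torch

    def guide_loss(sketch_feats, labels, centers, lam=0.01):
        # sketch_feats: (m, feat_dim); centers: (N, feat_dim); labels: (m,)
        f = torch.nn.functional.normalize(sketch_feats, dim=1)
        c = torch.nn.functional.normalize(centers, dim=1)
        cos = f @ c.t()                                  # (m, N) cosine similarities
        idx = torch.arange(f.size(0))
        l_c = (1.0 - cos[idx, labels]).mean()            # pull sketches toward their own class center
        mask = torch.ones_like(cos)
        mask[idx, labels] = 0.0
        l_a = ((1.0 - cos) * mask).sum(dim=1).mean()     # push sketches away from other class centers
        return l_c - lam * l_a                           # L_G = L_c - lambda * L_a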
33) Under the constraint of the guide loss function L_G, the sketch network CNN2-FC2 is trained so that the deep sketch features output by the second fully connected layer FC2 are constrained to lie in the three-dimensional model feature space while sketch features and three-dimensional model features of the same class are aligned. A shared feature space with aligned cross-modal features is finally obtained; because this space exhibits small cross-modal differences and effective cross-modal feature alignment, the goals of reducing cross-modal feature differences and aligning cross-modal features are achieved.
The specific method of the step 4) comprises the following steps:
for a three-dimensional model library to be retrieved, rendering all three-dimensional models into a two-dimensional view map by utilizing the step 1), and then utilizing the three-dimensional model network CNN trained in the step 2)1-FC1And extracting three-dimensional model characteristics from all the three-dimensional models to obtain a three-dimensional model characteristic library. For any query sketch, utilizing the sketch network CNN trained in the step 3)2-FC2And extracting sketch features of the query sketch to obtain query sketch features.
The specific method of the step 5) comprises the following steps:
The cosine distances between the query sketch features obtained in step 4) and each model feature in the three-dimensional model feature library are calculated and sorted from small to large; the three-dimensional model corresponding to each distance is then output according to the sorted result, completing the sketch-based three-dimensional model retrieval.
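A minimal sketch of this ranking step (PyTorch; the function and variable names are illustrative):

    import torch

    def retrieve(query_feat, gallery_feats, k=10):
        # query_feat: (feat_dim,) sketch feature; gallery_feats: (num_models, feat_dim)
        q = torch.nn.functional.normalize(query_feat, dim=0)
        g = torch.nn.functional.normalize(gallery_feats, dim=1)
        dist = 1.0 - g @ q                    # cosine distance to every 3D model feature
        return torch.argsort(dist)[:k]        # indices of the k closest three-dimensional models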
To verify the performance of the sketch-based three-dimensional model retrieval method proposed by the present invention, the method is applied to two widely used public standard datasets: the SHREC2013 dataset and the SHREC2014 dataset. The SHREC2013 dataset contains 1258 three-dimensional models divided into 90 categories; the number of samples per category is unbalanced, with an average of 14 three-dimensional models per category. Its sketch set contains 7200 hand-drawn sketches divided into the same 90 categories as the three-dimensional models; each category has 80 sketches, comprising 50 training samples and 30 test samples. SHREC2014 is an extension of the SHREC2013 dataset with more categories and a larger scale: it contains 8987 three-dimensional models divided into 171 categories, and its sketch set contains 13680 hand-drawn sketches in 171 categories, again with 80 sketches per category (50 training and 30 test samples). Because this dataset is larger, has more categories, a more unbalanced sample distribution, and larger intra-class variation, retrieval on it is more difficult and it provides a better measure of algorithm performance.
Experiments are carried out on the SHREC 2013 and SHREC 2014 data sets, with seven commonly used indexes as the evaluation criteria: the Precision-Recall curve (PR curve), Nearest Neighbor accuracy (NN), First Tier (FT), Second Tier (ST), E-Measure (E), Discounted Cumulative Gain (DCG), and mean Average Precision (mAP). The method is compared with other recent, state-of-the-art sketch-based three-dimensional model retrieval methods. For a fair comparison, we use the same base models as the other methods, i.e., CNN1 and CNN2 are implemented with the AlexNet, VGG16, VGG19, and ResNet-50 architectures respectively. Retrieval results and comparison on the SHREC 2013 dataset:
fig. 4 shows PR plots of the method of the present invention and other methods on the SHREC 2013 data set. As can be seen in the figure, the retrieval performance of the method is obviously superior to that of other methods. Table 1 also gives the data for the present method and other methods on the SHREC 2013 dataset under six evaluation criteria, and our method is overall superior to other leading edge methods on six criteria using the same underlying model.
Table 1 retrieval performance (%) comparison on SHREC 2013 dataset
Search results and comparisons on the SHREC 2014 dataset:
the SHREC 2014 data set is a larger and more difficult data set, and the retrieval performance of the method is also higher. Fig. 5 shows a PR plot of the method of the present invention on a SHREC 2014 data set versus other methods. It can be seen that the retrieval performance of the method is still better than that of other methods on more difficult data sets. Table 2 gives the data for the present method and other methods at six evaluation indices on the SHREC 2014 data set, and our method is still overall superior to other leading edge methods at six indices. This demonstrates that the methods herein can achieve superior search performance on difficult datasets as well.
Table 2 retrieve performance (%) comparison on SHREC 2014 data set
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and those skilled in the art can easily conceive of various equivalent modifications or substitutions within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (8)

1. A three-dimensional model sketch retrieval method based on a cross-modal guide network, characterized by comprising the following steps:
step S1: acquiring three-dimensional model training data and sketch training data;
step S2: training a three-dimensional model network, and learning by using the trained three-dimensional model network to obtain a three-dimensional model feature space, wherein the step S2 specifically includes:
s21: constructing a three-dimensional model network;
s22: using the classification loss function L_AM-S, inputting the three-dimensional model training data into the three-dimensional model network for training to obtain a trained three-dimensional model network;
s23: inputting the three-dimensional model training data into a trained three-dimensional model network, and learning to obtain three-dimensional model characteristics of all three-dimensional model training data and a classified three-dimensional model characteristic space;
s24: calculating the category center of each category of three-dimensional model features in the three-dimensional model feature space according to the category information;
step S3: training a sketch network by taking the three-dimensional model feature space as a target space to obtain the trained sketch network, wherein the step S3 specifically comprises the following steps:
s31: constructing a sketch network;
s32: constructing a guide loss function L_G by using the class centers and class information of the three-dimensional model features;
S33: using the guide loss function L_G, inputting the sketch training data into the sketch network for training to obtain a trained sketch network;
step S4: extracting the features of the three-dimensional models to be retrieved and the features of the query sketch by using the trained three-dimensional model network and the trained sketch network, and performing retrieval to obtain the three-dimensional models for the corresponding application.
2. The three-dimensional model sketch retrieval method based on the cross-modal guide network as claimed in claim 1, wherein the guide loss function L_G constrains the sketch features extracted by the sketch network into the three-dimensional model feature space and aligns sketch features having the same class information with the three-dimensional model features.
3. The method as claimed in claim 2, wherein the expression of the guide loss function L_G is:

L_G = L_c - \lambda L_a

L_c = \frac{1}{m}\sum_{i=1}^{m}\bigl(1 - \cos(f_i, c_{y_i})\bigr)

L_a = \frac{1}{m}\sum_{i=1}^{m}\sum_{j=1,\,j\neq y_i}^{N}\bigl(1 - \cos(f_i, c_j)\bigr)

where L_c is the cosine distance between the sketch features and the class centers of the three-dimensional model features of the same class, L_a is the sum of the cosine distances between the sketch features and the class centers of the three-dimensional model features of the other classes, λ is a hyper-parameter set to 0.01, m is the size of one batch of input data during sketch network training, f_i is the sketch feature of the i-th sketch, y_i is the class of the i-th sketch feature, c is a class center of the three-dimensional model features, c_{y_i} is the class center of the three-dimensional model features of the same class as the i-th sketch feature, c_j is the class center of the three-dimensional model features of a class different from that of the i-th sketch feature, and N is the total number of three-dimensional model feature classes.
4. The method as claimed in claim 1, wherein the classification loss function L_AM-S is the AM-softmax classification loss function, whose expression is:

L_{AM-S} = -\frac{1}{K}\sum_{k=1}^{K}\log\frac{e^{s\,(W_{y_k}^{\top} f_k - n)}}{e^{s\,(W_{y_k}^{\top} f_k - n)} + \sum_{j\neq y_k} e^{s\,W_j^{\top} f_k}}

where f_k is the three-dimensional model feature input to the classifier, W is the weight of the classifier (W_j denoting the weight vector of class j and y_k the class label of the k-th sample), K is the number of three-dimensional model samples in a training batch, n is a boundary coefficient, and s is a scaling coefficient applied after the weights and the three-dimensional model features are normalized.
5. The method as claimed in claim 1, wherein the three-dimensional model network comprises a first deep convolutional neural network CNN1 and a first fully connected layer FC1, and the sketch network comprises a second deep convolutional neural network CNN2 and a second fully connected layer FC2.
6. The method as claimed in claim 5, wherein the three-dimensional model training data includes the two-dimensional view maps corresponding to all three-dimensional models in the three-dimensional model data set, and the sketch training data has the same size as the two-dimensional view maps of the three-dimensional models.
7. The method as claimed in claim 6, wherein the first deep convolutional neural network CNN1 extracts features from each two-dimensional view map and performs feature fusion, the first fully connected layer FC1 outputs the three-dimensional model features, and the second fully connected layer FC2 outputs the sketch features.
8. The method for retrieving the three-dimensional model sketch based on the cross-modal guidance network as claimed in claim 7, wherein said step S4 specifically comprises:
s41: rendering all three-dimensional models to be retrieved into a two-dimensional view map;
s42: inputting a two-dimensional view map of a three-dimensional model to be retrieved into a three-dimensional model network, and extracting characteristics of the three-dimensional model to be retrieved; inputting the query sketch into a sketch network, and extracting the characteristics of the query sketch;
s43: calculating the cosine distances between the query sketch features and all the three-dimensional model features to be retrieved, and sorting the distances;
s44: outputting, according to the sorted result, the three-dimensional model corresponding to each distance in turn to complete the three-dimensional model retrieval.
CN202010097592.9A 2020-02-17 2020-02-17 Three-dimensional model sketch retrieval method based on cross-modal guide network Active CN111368123B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010097592.9A CN111368123B (en) 2020-02-17 2020-02-17 Three-dimensional model sketch retrieval method based on cross-modal guide network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010097592.9A CN111368123B (en) 2020-02-17 2020-02-17 Three-dimensional model sketch retrieval method based on cross-modal guide network

Publications (2)

Publication Number Publication Date
CN111368123A CN111368123A (en) 2020-07-03
CN111368123B true CN111368123B (en) 2022-06-28

Family

ID=71206318

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010097592.9A Active CN111368123B (en) 2020-02-17 2020-02-17 Three-dimensional model sketch retrieval method based on cross-modal guide network

Country Status (1)

Country Link
CN (1) CN111368123B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112199543B (en) * 2020-10-14 2022-10-28 哈尔滨工程大学 Confrontation sample generation method based on image retrieval model
CN113033438B (en) * 2021-03-31 2022-07-01 四川大学 Data feature learning method for modal imperfect alignment
CN113656616B (en) * 2021-06-23 2024-02-27 同济大学 Three-dimensional model sketch retrieval method based on heterogeneous twin neural network
CN113554115B (en) * 2021-08-12 2022-09-13 同济大学 Three-dimensional model sketch retrieval method based on uncertain learning
CN117473105B (en) * 2023-12-28 2024-04-05 浪潮电子信息产业股份有限公司 Three-dimensional content generation method based on multi-mode pre-training model and related components

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2015201151A (en) * 2014-04-04 2015-11-12 国立大学法人豊橋技術科学大学 Three-dimensional model retrieval system, and three-dimensional model retrieval method
CN109033144A (en) * 2018-06-11 2018-12-18 厦门大学 Method for searching three-dimension model based on sketch
CN109213884A (en) * 2018-11-26 2019-01-15 北方民族大学 A kind of cross-module state search method based on Sketch Searching threedimensional model
CN110188228A (en) * 2019-05-28 2019-08-30 北方民族大学 Cross-module state search method based on Sketch Searching threedimensional model

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2015201151A (en) * 2014-04-04 2015-11-12 国立大学法人豊橋技術科学大学 Three-dimensional model retrieval system, and three-dimensional model retrieval method
CN109033144A (en) * 2018-06-11 2018-12-18 厦门大学 Method for searching three-dimension model based on sketch
CN109213884A (en) * 2018-11-26 2019-01-15 北方民族大学 A kind of cross-module state search method based on Sketch Searching threedimensional model
CN110188228A (en) * 2019-05-28 2019-08-30 北方民族大学 Cross-module state search method based on Sketch Searching threedimensional model

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Research on three-dimensional model retrieval based on deep learning (基于深度学习的三维模型检索研究); Zhang Jing (张静); Intelligent Computer and Applications (《智能计算机与应用》); 2019-05-01; Vol. 9, No. 3; full text *
End-to-end sketch-based three-dimensional model retrieval with joint feature mapping (基于联合特征映射的端到端三维模型草图检索); Bai Jing (白静); Journal of Computer-Aided Design & Computer Graphics (《计算机辅助设计与图形学学报》); 2019-12-15; full text *

Also Published As

Publication number Publication date
CN111368123A (en) 2020-07-03

Similar Documents

Publication Publication Date Title
CN111368123B (en) Three-dimensional model sketch retrieval method based on cross-modal guide network
JP2018205937A (en) Image retrieval device and program
CN111127385B (en) Medical information cross-modal Hash coding learning method based on generative countermeasure network
CN110298395B (en) Image-text matching method based on three-modal confrontation network
CN114220124A (en) Near-infrared-visible light cross-modal double-flow pedestrian re-identification method and system
CN109840290B (en) End-to-end depth hash-based dermoscope image retrieval method
CN108520213B (en) Face beauty prediction method based on multi-scale depth
CN110135459A (en) A kind of zero sample classification method based on double triple depth measure learning networks
CN106844620B (en) View-based feature matching three-dimensional model retrieval method
CN110097060A (en) A kind of opener recognition methods towards trunk image
CN111324765A (en) Fine-grained sketch image retrieval method based on depth cascade cross-modal correlation
WO2020107922A1 (en) 3d fingerprint image-based gender recognition method and system
CN114549850B (en) Multi-mode image aesthetic quality evaluation method for solving modal missing problem
CN112015868A (en) Question-answering method based on knowledge graph completion
CN115049952B (en) Juvenile fish limb identification method based on multi-scale cascade perception deep learning network
Nahar et al. Fingerprint classification using deep neural network model resnet50
CN110580339A (en) Method and device for perfecting medical term knowledge base
CN114241191A (en) Cross-modal self-attention-based non-candidate-box expression understanding method
CN114510594A (en) Traditional pattern subgraph retrieval method based on self-attention mechanism
CN111914912B (en) Cross-domain multi-view target identification method based on twin condition countermeasure network
CN115878832B (en) Ocean remote sensing image audio retrieval method based on fine pair Ji Panbie hash
CN116956138A (en) Image gene fusion classification method based on multi-mode learning
Li et al. Facial age estimation by deep residual decision making
CN113887653B (en) Positioning method and system for tight coupling weak supervision learning based on ternary network
CN113076490B (en) Case-related microblog object-level emotion classification method based on mixed node graph

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant