CN113032613A - Three-dimensional model retrieval method based on interactive attention convolution neural network - Google Patents

Three-dimensional model retrieval method based on interactive attention convolution neural network

Info

Publication number
CN113032613A
CN113032613A (application CN202110270518.7A; granted as CN113032613B)
Authority
CN
China
Prior art keywords
model
view
neural network
sketch
dimensional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110270518.7A
Other languages
Chinese (zh)
Other versions
CN113032613B (en)
Inventor
贾雯惠
高雪瑶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin University of Science and Technology
Original Assignee
Harbin University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin University of Science and Technology
Priority to CN202110270518.7A priority Critical patent/CN113032613B/en
Publication of CN113032613A publication Critical patent/CN113032613A/en
Application granted granted Critical
Publication of CN113032613B publication Critical patent/CN113032613B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/60Analysis of geometric attributes
    • G06T7/66Analysis of geometric attributes of image moments or centre of gravity
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

The invention provides a three-dimensional model retrieval method based on an interactive attention convolutional neural network. First, the three-dimensional model is preprocessed: the projection angles are fixed to obtain 6 views of the three-dimensional model, which are converted into line drawings to serve as the model's view set. Secondly, an interactive attention module is embedded in the convolutional neural network to extract semantic features, increasing data interaction between two network layers of the convolutional neural network. Global features are extracted with the Gist algorithm and a two-dimensional shape distribution algorithm. Thirdly, the similarity between the sketch and each two-dimensional view is computed using the Euclidean distance. These similarities are then combined with weights to retrieve the three-dimensional model. The method alleviates the inaccurate semantic features caused by overfitting when a neural network is trained on small-sample data, and improves the accuracy of three-dimensional model retrieval.

Description

Three-dimensional model retrieval method based on interactive attention convolution neural network
Technical field:
The invention relates to a three-dimensional model retrieval method based on an interactive attention convolutional neural network, which is well applied in the field of three-dimensional model retrieval.
Background art:
In recent years, with the continuous development of science and technology, three-dimensional models not only play an important role in many professional fields but are also widespread in daily life, and the demand for retrieving three-dimensional models is gradually increasing. Example-based three-dimensional model retrieval can only take models already in the database as queries, so it lacks generality. Sketch-based three-dimensional model retrieval lets users draw freely according to their needs, is convenient and applicable, and has broad prospects.
Currently, common algorithms use a single hand-crafted feature or a deep learning algorithm to solve the sketch-based model retrieval problem. However, traditional hand-crafted features have shortcomings: researchers need a large amount of prior knowledge, parameters must be set manually in advance, and the extracted features often fall short of expectations. A deep learning algorithm can adjust parameters automatically and has good extensibility, but it also has drawbacks. Because a deep neural network has many nodes, a large amount of data is needed to train it to obtain excellent results; once the training data are insufficient, overfitting occurs and the results deviate. In order to obtain better retrieval results when training samples are insufficient, the invention provides a three-dimensional model retrieval method based on an interactive attention convolutional neural network.
Summary of the invention:
The invention discloses a three-dimensional model retrieval method based on an interactive attention convolutional neural network, aiming to solve the poor retrieval performance of sketch-based three-dimensional model retrieval methods when training samples are insufficient.
Therefore, the invention provides the following technical scheme:
1. a three-dimensional model retrieval method based on an interactive attention convolution neural network is characterized by comprising the following steps:
step 1: and carrying out data preprocessing, projecting the three-dimensional model to obtain a plurality of views corresponding to the three-dimensional model, and obtaining an edge view set of the model by using an edge detection algorithm.
Step 2: designing a deep convolutional neural network, and optimizing a network model by using an interactive attention module. And selecting one part of the view sets as a training set, and the other part of the view sets as a test set.
And step 3: the training includes two processes, forward propagation and backward propagation. And training the training data serving as input of the interactive attention convolution neural network model to obtain the optimized interactive attention convolution neural network model.
And 4, step 4: and respectively extracting semantic features of the freehand sketch and the model view by using the optimized interactive attention convolution neural network model and the gist feature, and respectively extracting two-dimensional shape distribution features of the freehand sketch and the model view by using the two-dimensional shape distribution feature.
And 5: and fusing the plurality of features in a weighting mode. And retrieving the model which is most similar to the hand-drawn sketch according to the Euclidean distance.
2. The method for retrieving a three-dimensional model based on an interactive attention convolutional neural network as claimed in claim 1, wherein in step 1, the three-dimensional model is projected to obtain a plurality of corresponding views and an edge detection algorithm is used to obtain the edge view set of the model; the specific steps are as follows:
Step 1-1: place the three-dimensional model at the center of a virtual sphere;
Step 1-2: place a virtual camera above the model and rotate the model through 360 degrees in 30-degree steps to obtain the 12-view set of the three-dimensional model;
Step 1-3: obtain the edge view of each of the 12 original views using the Canny edge detection algorithm.
After projection, the three-dimensional model is characterized as a group of two-dimensional views, and the Canny edge detection algorithm reduces the semantic gap between the hand-drawn sketch and the three-dimensional model views.
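As an illustration, a minimal Python sketch of this edge-view construction follows; it assumes the 12 projected views have already been rendered by the virtual camera to image files (the file names and Canny thresholds are assumptions, not values from the patent):

```python
# Minimal sketch of step 1, assuming the 12 views were already rendered
# to view_00.png ... view_11.png (file names and thresholds illustrative).
import cv2

def build_edge_view_set(view_paths, low=100, high=200):
    """Apply Canny edge detection to each projected view (step 1-3)."""
    edge_views = []
    for path in view_paths:
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        edge_views.append(cv2.Canny(gray, low, high))
    return edge_views

edge_set = build_edge_view_set([f"view_{i:02d}.png" for i in range(12)])
```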
3. The method for retrieving the three-dimensional model based on the interactive attention convolutional neural network as claimed in claim 1, wherein in step 2, the deep convolutional neural network is designed and the network model is optimized with the interactive attention module; the specific steps are as follows:
Step 2-1: determine the depth of the convolutional neural network, the size of the convolution kernels, and the number of convolutional and pooling layers;
Step 2-2: design the interactive attention module. A global pooling layer is connected after the output of the convolutional layer to obtain the information amount Z_k of each channel of the convolutional layer conv_n:

Z_k = (1 / (W_n × H_n)) Σ_{i=1}^{W_n} Σ_{j=1}^{H_n} conv_nk(i, j)

where conv_nk denotes the k-th feature map output by the n-th convolutional layer, of size W_n × H_n.
Step 2-3: connect two fully connected layers after the global pooling layer and adaptively adjust the attention weight S_kn of each channel according to the information amount:

S_kn = F_ex(Z, W) = σ(g(Z, W)) = σ(W_2 δ(W_1 Z))

where δ is the ReLU function, σ is the sigmoid function, and W_1 and W_2 are the weights of the first and second fully connected layers, respectively.
Step 2-4: compute the interactive attention weights S_k1 and S_k2 of the two neighbouring convolutional layers respectively and fuse them to obtain the optimal attention weight S_k:

S_k = Average(S_k1, S_k2)

Step 2-5: fuse the attention weight S_k with the second convolutional layer conv_2 and the first pooling layer a_p to obtain the final result a_2:

a_2 = (S_k ⊗ conv_2) ⊕ a_p

where ⊗ denotes channel-wise weighting and ⊕ element-wise fusion.
One part of the view set is selected as the training set and the other part as the test set.
4. The method for retrieving the three-dimensional model based on the interactive attention convolutional neural network as claimed in claim 1, wherein in step 3, the convolutional neural network model is trained; the specific steps are as follows:
Step 3-1: input the training data into the initialized interactive attention convolutional neural network model;
Step 3-2: extract view features through the convolutional layers; the shallow convolutional layers extract low-level features and the deeper convolutional layers extract high-level semantic features;
Step 3-3: the attention module, fused with the neighbouring convolutional layers through channel weighting, reduces the information lost when the edge views of the hand-drawn sketch or the model are pooled;
Step 3-4: reduce the scale of the view features through the pooling layers, thereby reducing the number of parameters and speeding up model computation;
Step 3-5: pass through a Dropout layer to reduce the overfitting caused by insufficient training samples;
Step 3-6: after alternating convolution, attention, Dropout and pooling operations, the features finally enter a fully connected layer, which reduces their dimensionality and connects them into a one-dimensional high-level semantic feature vector;
Step 3-7: use the labeled 2D views to optimize the weights and biases of the interactive attention convolutional neural network during back propagation. The 2D view set is {v_1, v_2, …, v_n} with label set {l_1, l_2, …, l_n}. The 2D views belong to t classes, numbered 1, 2, …, t. After forward propagation, the prediction probability of v_i in class j is y_test_ij. The label l_i of v_i is compared with class j to compute the expected probability y_ij:

y_ij = 1 if l_i = j, otherwise y_ij = 0

Step 3-8: compare the predicted probability y_test_ij with the true probability y_ij and compute the error loss using the cross-entropy loss function:

loss = −Σ_{i=1}^{n} Σ_{j=1}^{t} y_ij log(y_test_ij)

The interactive attention convolutional neural network model is iterated continuously to obtain the optimized model, and the weights and biases are stored.
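One plausible training loop for steps 3-1 to 3-8 is sketched below, with the one-hot expected probability y_ij and the cross-entropy loss written out explicitly; the optimizer settings and data loading are assumptions:

```python
# Plausible training loop for steps 3-1 to 3-8; loader and optimizer are
# assumptions, `model` is the interactive attention CNN.
import torch.nn.functional as F

def train_epoch(model, loader, optimizer, num_classes):
    model.train()
    for views, labels in loader:                 # labeled 2D edge views
        logits = model(views)                    # forward propagation
        y = F.one_hot(labels, num_classes).float()   # expected probability y_ij
        log_y_test = F.log_softmax(logits, dim=1)    # log of y_test_ij
        loss = -(y * log_y_test).sum(dim=1).mean()   # cross-entropy loss
        optimizer.zero_grad()
        loss.backward()                          # back propagation
        optimizer.step()                         # update weights and biases
```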
5. The method for retrieving the three-dimensional model based on the interactive attention convolutional neural network as claimed in claim 1, wherein in step 4, semantic features of the freehand sketch and the model views are extracted with the optimized interactive attention convolutional neural network model and the Gist feature, and their two-dimensional shape distribution features are extracted with the two-dimensional shape distribution algorithm; the specific process is as follows:
Step 4-1: input the test data into the optimized interactive attention convolutional neural network model;
Step 4-2: extract the features of the fully connected layer as the high-level semantic features of the hand-drawn sketch or model view.
Step 4-3: divide the sketch or 2D view of size m × n into 4 × 4 blocks; each block has size a × b, where a = m/4 and b = n/4.
Step 4-4: process each block with 32 Gabor filters of 4 scales and 8 directions, and combine the processed features to obtain the Gist feature:

G(x, y) = cat(I(x, y) * g_ij(x, y)), i = 1, …, 4, j = 1, …, 8

where G(x, y) is the Gist feature over the 32 Gabor filters and cat() denotes the concatenation operation. Here x and y are pixel positions and I(x, y) denotes a block; g_ij(x, y) is the filter at the i-th scale and j-th direction, and * denotes the convolution operation.
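A minimal sketch of this Gist computation using OpenCV Gabor kernels is given below; the filter parameters (kernel size, sigma, wavelength) are assumptions, but the block grid and filter bank sizes follow the text, giving 4 × 4 × 32 = 512 dimensions:

```python
# Sketch of Gist extraction (steps 4-3 and 4-4): 4x4 blocks, 32 Gabor
# filters (4 scales x 8 directions); Gabor parameters are assumptions.
import cv2
import numpy as np

def gist_feature(img, grid=4, scales=4, orientations=8):
    m, n = img.shape
    a, b = m // grid, n // grid                  # block size a*b (step 4-3)
    feats = []
    for s in range(scales):
        for o in range(orientations):
            kern = cv2.getGaborKernel((15, 15), sigma=2.0 + s,
                                      theta=o * np.pi / orientations,
                                      lambd=4.0 + 2 * s, gamma=0.5)
            resp = cv2.filter2D(img.astype(np.float32), -1, kern)
            for i in range(grid):
                for j in range(grid):
                    feats.append(resp[i*a:(i+1)*a, j*b:(j+1)*b].mean())
    return np.array(feats)                       # 4*4*32 = 512 dimensions
```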
Step 4-5: randomly and equidistantly sample points on the boundary of the sketch or 2D view, collected as {(x_1, y_1), …, (x_i, y_i), …, (x_n, y_n)}, where (x_i, y_i) are point coordinates.
Step 4-6: use the D1 descriptor to represent the distance between the centroid and a random sample point on the boundary of the sketch or two-dimensional view. Points are drawn from the samples and collected as PD1 = {ai_1, …, ai_k, …, ai_N}. The D1 shape distribution feature set is {D1_v_1, …, D1_v_i, …, D1_v_Bins}, where D1_v_i is the statistic of the interval (BinSize × (i−1), BinSize × i), Bins is the number of intervals and BinSize is the interval length:

D1_v_i = |{P | dist(P, O) ∈ (BinSize × (i−1), BinSize × i), P ∈ PD1}|

where BinSize = max({dist(P, O) | P ∈ PD1}) / N, dist() is the Euclidean distance between two points, and O is the centroid of the sketch or 2D view.
Step 4-7: use the D2 descriptor to describe the distance between two random sample points on the boundary of the sketch or two-dimensional view. Point pairs are drawn from the samples and collected as PD2 = {(ai_1, bi_1), (ai_2, bi_2), …, (ai_N, bi_N)}. The D2 shape distribution feature set is {D2_v_1, …, D2_v_i, …, D2_v_Bins}, where D2_v_i denotes the statistic in the interval (BinSize × (i−1), BinSize × i):

D2_v_i = |{P | dist(P) ∈ (BinSize × (i−1), BinSize × i), P ∈ PD2}|

where BinSize = max({dist(P) | P ∈ PD2}) / N.
Step 4-8: use the D3 descriptor to describe the square root of the area formed by 3 random sample points on the boundary of the sketch or 2D view. Point triplets are drawn from the samples and collected as PD3 = {(ai_1, bi_1, ci_1), (ai_2, bi_2, ci_2), …, (ai_n, bi_n, ci_n)}. The D3 shape distribution feature set is {D3_v_1, …, D3_v_i, …, D3_v_Bins}, where D3_v_i denotes the statistic in the interval (BinSize × (i−1), BinSize × i):

D3_v_i = |{P | herson(P) ∈ (BinSize × (i−1), BinSize × i), P ∈ PD3}|

where herson() denotes Heron's formula, applied to the triangle P = (P_1, P_2, P_3):

herson(P) = sqrt(Area(P)), Area(P) = sqrt(p(p − a)(p − b)(p − c)), p = (a + b + c) / 2

where a = dist(P_1, P_2), b = dist(P_1, P_3), c = dist(P_2, P_3).
Step 4-9: concatenate D1_v_i, D2_v_i and D3_v_i (i = 1, 2, …, Bins) to form the shape distribution feature.
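The D1/D2/D3 histograms of steps 4-5 to 4-9 can be sketched as follows; the sample count, bin count and histogram ranges are assumptions:

```python
# Sketch of the D1/D2/D3 shape distribution features (steps 4-5 to 4-9);
# sample and bin counts are assumptions.
import numpy as np

def shape_distribution(points, bins=64, n_samples=1024, seed=0):
    rng = np.random.default_rng(seed)
    pts = np.asarray(points, dtype=float)
    pick = lambda: pts[rng.integers(len(pts), size=n_samples)]

    o = pts.mean(axis=0)                             # centroid O
    d1 = np.linalg.norm(pick() - o, axis=1)          # D1: point to centroid

    d2 = np.linalg.norm(pick() - pick(), axis=1)     # D2: point to point

    p, q, r = pick(), pick(), pick()                 # D3: triangle area
    a = np.linalg.norm(p - q, axis=1)
    b = np.linalg.norm(p - r, axis=1)
    c = np.linalg.norm(q - r, axis=1)
    s = (a + b + c) / 2
    area = np.sqrt(np.clip(s*(s-a)*(s-b)*(s-c), 0, None))  # Heron's formula
    d3 = np.sqrt(area)                               # square root of the area

    hist = lambda d: np.histogram(d, bins=bins, range=(0, d.max()))[0]
    return np.concatenate([hist(d1), hist(d2), hist(d3)])  # step 4-9
```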
6. The three-dimensional model retrieval method based on interactive attention CNN and weighted similarity calculation of claim 1, wherein in step 5, a plurality of features are fused and the model most similar to the hand-drawn sketch is retrieved according to the similarity measurement formula; the specific process is as follows:
Step 5-1: select the Euclidean distance as the similarity measure;
Step 5-2: extract feature vectors from the two-dimensional views and the sketch using the improved interactive attention convolutional neural network and normalize them; compute the similarity with the Euclidean distance, denoted distance1, and the retrieval accuracy, denoted t1;
Step 5-3: extract feature vectors of the sketch and the model views using the Gist feature and normalize them; compute the similarity with the Euclidean distance, denoted distance2, and the retrieval accuracy, denoted t2;
Step 5-4: extract feature vectors of the sketch and the model views using the two-dimensional shape distribution feature and normalize them; compute the similarity with the Euclidean distance, denoted distance3, and the retrieval accuracy, denoted t3;
Step 5-5: compare the accuracies of the three features and fuse the similarities with weights to form the new feature similarity Sim(distance):

Sim(distance) = w1*distance1 + w2*distance2 + w3*distance3, with w1 + w2 + w3 = 1

where w1 = t1/(t1+t2+t3), w2 = t2/(t1+t2+t3), w3 = t3/(t1+t2+t3).
Step 5-6: sort by similarity from small to large to produce the retrieval results.
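A small sketch of this weighted fusion follows; the default accuracy values are the ones reported in the embodiment below (0.96, 0.53, 0.42), and the function name is illustrative:

```python
# Sketch of the step 5 weighted fusion; the distance arguments hold the
# per-candidate distances, accuracies default to the embodiment's values.
import numpy as np

def fuse_and_rank(distance1, distance2, distance3, acc=(0.96, 0.53, 0.42)):
    t1, t2, t3 = acc
    w1, w2, w3 = (t / (t1 + t2 + t3) for t in acc)   # weights from accuracies
    sim = w1 * distance1 + w2 * distance2 + w3 * distance3
    return np.argsort(sim)     # ascending: smallest distance = best match
```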
Beneficial effects:
1. The invention discloses a three-dimensional model retrieval method based on an interactive attention convolutional neural network. Model retrieval was performed on the SHREC13 database and the ModelNet40 database, and experimental results show that the method achieves high accuracy.
2. The retrieval model combines an interactive attention module with a convolutional neural network. The convolutional neural network has local perception and parameter sharing, handles high-dimensional data well, and requires no manual selection of data features. The proposed interactive attention model combines the attention weights of two adjacent convolutional layers to realize data interaction between the two network layers, and the trained convolutional neural network model achieves a better retrieval effect.
3. During training, parameters are updated by stochastic gradient descent. The error propagates back along its original route: the parameters of each layer are updated layer by layer, from the output layer backwards through each hidden layer, until the error reaches the input layer. Forward and backward propagation are performed continuously to reduce the error and update the model parameters until the CNN is trained.
4. The invention improves the three-dimensional shape distribution features so that the method suits sketches and two-dimensional views; shape information of the sketch and the three-dimensional model views is described with a shape distribution function.
5. The invention adaptively fuses the similarities of the various features, achieving a better retrieval effect.
Description of the drawings:
fig. 1 is a sketch to be retrieved in an embodiment of the present invention.
Fig. 2 is a three-dimensional model search framework diagram according to an embodiment of the present invention.
FIG. 3 is a projection view of a model in an embodiment of the invention.
Fig. 4 is a Canny edge view in an embodiment of the invention.
FIG. 5 is a model of an interactive attention convolutional neural network in an embodiment of the present invention.
FIG. 6 illustrates a training process of the Interactive attention convolutional neural network in an embodiment of the present invention.
FIG. 7 illustrates a testing process of the Interactive attention convolutional neural network in an embodiment of the present invention.
Detailed description of the embodiments:
In order to clearly and completely describe the technical solutions in the embodiments of the present invention, the invention is further described in detail below with reference to the drawings in the embodiments.
The invention uses sketches from SHREC13 and model data from the ModelNet40 model library for experimental verification, taking "17205.png" from the SHREC13 sketches and "table_0399.off" from the ModelNet40 model library as examples. The sketch to be retrieved is shown in fig. 1.
The experimental framework of the three-dimensional model retrieval method based on the interactive attention convolutional neural network, shown in fig. 2, comprises the following steps:
step 1, projecting the three-dimensional model to obtain a three-dimensional model edge view set, which specifically comprises the following steps:
step 1-1, a table _0399.off file is placed in the center of a virtual sphere.
Step 1-2, placing a virtual camera above the model, and rotating the model by 360 degrees at each step by 30 degrees, so as to obtain 12 view sets of the three-dimensional model, wherein one view is taken as an example for display, and the projection view of the model is shown in fig. 3;
the views obtained by steps 1-3 using the Canny edge detection algorithm are shown in fig. 4;
step 2, designing a deep convolutional neural network, and optimizing a network model by using an interactive attention module, as shown in fig. 5, specifically:
and 2-1, designing a deep convolutional neural network for better characteristic extraction effect, wherein the deep convolutional neural network comprises 5 convolutional layers, 4 pooling layers, two dropout layers, a connecting layer and a full connecting layer.
Step 2-2: embed the interactive attention module into the designed convolutional neural network structure, connect a global pooling layer after the output of the convolutional layer, and obtain the information amount Z_k of each channel. Taking the sketch as an example, the information amounts of its first convolutional layer are as follows:
Zk=[[0.0323739 0.04996519 0.0190248 0.03274497 0.03221277 0.00206719 0.04075038 0.01613641 0.03390235 0.04024649 0.03553107 0.00632962 0.03442683 0.04588291 0.01900478 0.02144121 0.03710039 0.03861086 0.05596253 0.0439686 0.03611921 0.04850776 0.00716817 0.02596463 0.00525256 0.03657651 0.02809189 0.03490375 0.04528182 0.03938764 0.00690786 0.04449471]]
step 2-3, connecting two full connection layers after the global pooling layer, and adaptively adjusting the attention weight S of each channel according to the information quantitykn. Taking the sketch as an example, the attention weights of the sketch are as follows:
Skn=[[0.49450904 0.49921992 0.50748134 0.5051483 0.5093386 0.49844238 0.50426346 0.50664175 0.5053692 0.5012332 0.5004162 0.49788538 0.505669 0.5012219 0.5009724 0.4942028 0.49796405 0.4992011 0.5064934 0.4963113 0.50500274 0.50238824 0.50202376 0.49661288 0.50185806 0.5048757 0.5073203 0.50703263 0.51684725 0.50641936 0.5052296 0.4979179]]
step 2-4 calculating the interactive attention weight S of two neighborhood convolution layers respectivelyk1And Sk2And fusing the data to obtain the optimal attention weight SkThe optimal attention weight for the sketch is as follows:
Sk=[[0.4625304 0.47821882 0.5064253 0.5032532 0.5093386 0.49877496 0.50426346 0.50664175 0.5053692 0.5012332 0.5004162 0.49784237 0.505688 0.5011142 0.5008647 0.4942028 0.49796405 0.4991069 0.5064934 0.4963113 0.5102687 0.50125698 0.502524856 0.49675384 0.49365704 0.5027958 0.5076529 0.50814523 0.51006527 0.50361942 0.50422731 0.4635842]]
step 2-5 will note the weight SkAnd second convolution layer conv2The first pooling layer apFusing to obtain final result a2Partial results for the second convolution layer of the sketch are:
a2=[[[[0.14450312 0.0644969 0.10812703...0.18608719 0.01994037 0]
[0.18341058 0.15881275 0.24716881...0.18875208 0.14420813 0.08290599]
[0.17390229 0.14937611 0.2255666...0.15295741 0.18792515 0.08066748]
...
[0.31344187 0.18656467 0.22178406...0.22087486 0.22130579 0.00955889]
[0.12405898 0.10548315 0.11685486...0.10439464 0.2906406 0.14846338]]
[[0.10032222 0.21919143 0.09797319...0.13584027 0. 0.12112971]
[0.20946684 0.14252397 0.17954415...0.09708451 0. 0.15463363]
[0.06941956 0.03963253 0.13273408...0.00173131 0.04566149 0.14895247]
...
[[0.01296724 0.27460644 0.09022377...0.06938899 0.04487894 0.2567152]
[0.16118288 0.38024116 0.02033611...0.13374138 0 0.17068687]
[0.09430372 0.35878736 0...0.0846955 0 0.25289127]
...
[0.10363265 0.4103881 0...0.0728834 0 0.29586816]
[0.18578637 0.34666267 0...0.05323519 0 0.27042198]
[0.0096841 0.18718664 0...0.04646093 0.00576336 0.155898]]]]
step 3, training the convolutional neural network model, as shown in fig. 6, specifically comprising the steps of:
step 3-1, inputting the sketch and the edge two-dimensional view into an initialized interactive attention convolution neural network as training data;
step 3-2, extracting more detailed view characteristics through the convolution layer;
Step 3-3: after the attention module is fused with the neighbouring convolutional layers through channel weighting, the information lost when the edge views of the hand-drawn sketch or the model are pooled can be reduced;
step 3-4, extracting the maximum view information through a pooling layer;
step 3-5, passing through a Dropout layer, reducing the overfitting problem caused by insufficient training samples;
Step 3-6: after alternating convolution, attention, Dropout and pooling operations, the features finally enter a fully connected layer, which reduces their dimensionality and connects them into a one-dimensional high-level semantic feature vector;
steps 3-7 learned by the softmax function that the probability of the sketch "17205. png" under the "table" category is 89.99%
Step 3-8: compare the predicted probability y_test_ij with the true probability y_ij and compute the error loss using the cross-entropy loss function:

loss_17205 = −log(0.8999) ≈ 0.105

where loss_17205 is the error of the sketch "17205.png".
The interactive attention convolutional neural network model is iterated continuously to obtain the optimized interactive attention convolutional neural network model.
Step 4, extracting semantic features and shape distribution features, specifically:
step 4-1, inputting the test data into the optimized interactive attention convolution neural network model, wherein the test process is shown in FIG. 7;
and 4-2, extracting the features of the full connection layer to be used as high-level semantic features of the hand-drawn sketch or the model view. Part of the high-level semantic features of the extracted sketch are as follows:
Feature=[[0,0.87328064,0,0,1.3293583,0,2.3825126,0,0,4.8035927,0,1.5186063,0,3.6845286,1.0825952,0,1.8516512,1.0285587,0,0,0,3.3322043,1.0545557,0,0,4.8707848,3.042554,0,0,0,0,6.8227463,2.537525,1.5318785,2.7271123,0,3.0482264……]]
step 4-3 dividing the size sketch or two-dimensional view into 4 x 4 blocks;
step 4-4 each block is processed by 32 Gabor filters of 4 scales, 8 directions. And combining the processed features to obtain gist features. The Gist feature is extracted 512 dimensions, and the partial Gist feature of the sketch is as follows:
G(x,y)=[[5.81147151e-03 1.51588341e-02 1.75721212e-03 2.10059434e-01 1.62918585e-01 1.54040498e-01 1.44374291e-01 8.71880878e-01 5.26758657e-01 4.14263371e-01 7.17606844e-01 6.22190594e-01 1.11205845e-01 7.69002490e-04 2.18182730e-01 2.29565939e-01 9.32599080e-03 1.10805327e-02 1.40071468e-03 2.58543039e-01 5.67934220e-02 1.06132064e-01 9.10082146e-02 4.02163211e-01 2.97883778e-01 2.45860956e-01 4.02066928e-01 2.84401506e-01 1.03228724e-01 6.37419945e-04 2.71290458e-01……]]
step 4-5, randomly and equidistantly sampling points on the boundary of the sketch or the two-dimensional view;
steps 4-6 use the D1 descriptor to represent the distance between the centroid and the random sample point on the boundary of the sketch or two-dimensional view. The portion D1 descriptor of the sketch is as follows:
D1=[0.30470497858541628,0.6256941275550102,0.11237884569183111,0.23229854666522,0.2657159486944761,0.0731852015843772,0.40751749800795261……]
steps 4-7 describe the distance between two random sample points on the boundary of the sketch or two-dimensional view using the D2 descriptor. The portion D2 descriptor of the sketch is as follows:
D2=[0.13203683803844625,0.028174099301372796,0.15392681513105217,0.130238265264,0.123460163767958,0.06985106421513015,0.12992235205980568……]
steps 4-8 utilize the D3 descriptor to describe the square root of the area formed by the 3 random sample points on the boundary of the sketch or two-dimensional view. The portion D3 descriptor of the sketch is as follows:
D3=[0.9193157274532394,0.5816923854309814,0.46980644879802125,0.498873567635874,0.7195175116705602,0.29425190983247506,0.8724092377243926……]
step 4-9, connecting D1, D2 and D3 in series to form a shape distribution characteristic;
and 5, fusing a plurality of characteristics of the sketch, and retrieving a model most similar to the hand-drawn sketch according to the similarity measurement formula, wherein the method specifically comprises the following steps:
step 5-1, comparing various similarity retrieval methods, wherein the final effect is best in Euclidean distance;
and 5-2, extracting feature vectors from the two-dimensional view and the sketch by using the improved interactive attention convolution neural network, and normalizing the feature vectors. Calculating the similarity by using the Euclidean distance, marking as distance1, and the retrieval accuracy is 0.96;
and 5-3, extracting the feature vectors of the sketch and the model view by using the gist features, and normalizing the feature vectors. Calculating the similarity by using the Euclidean distance, marking as distance2, and the retrieval accuracy is 0.53;
and 5-4, extracting a feature vector between the sketch and the model view by using the two-dimensional shape distribution feature, and performing normalization processing on the feature vector. Calculating the similarity by using the Euclidean distance, marking as distance3, and the accuracy of retrieval is 0.42;
and 5-5, determining the weight according to the retrieval accuracy of the three characteristics.
Figure BDA0002974166900000151
Figure BDA0002974166900000152
Figure BDA0002974166900000153
The final weight is determined as: 5:3:2
Sim(distance)=0.5*distance1+0.3*distance2+0.2*distance
Step 5-6: sort by similarity from small to large to produce the retrieval results.
The three-dimensional model retrieval method based on the interactive attention convolutional neural network in this embodiment adopts a weighted fusion of traditional features and deep features, thereby achieving a better retrieval effect.
The foregoing is a detailed description of embodiments of the invention in conjunction with the accompanying drawings; the specific embodiments are provided merely to assist in understanding the method of the invention. Those skilled in the art may modify and adapt the invention within the scope of the embodiments and applications according to its spirit, and the invention should not be construed as being limited thereto.

Claims (6)

1. A three-dimensional model retrieval method based on an interactive attention convolution neural network is characterized by comprising the following steps:
Step 1: perform data preprocessing: project the three-dimensional model to obtain a plurality of corresponding views, and obtain the edge view set of the model using an edge detection algorithm.
Step 2: design a deep convolutional neural network and optimize the network model with an interactive attention module. One part of the view set is selected as the training set and the other part as the test set.
Step 3: training includes two processes, forward propagation and back propagation. The training data serve as the input of the interactive attention convolutional neural network model, which is trained to obtain the optimized interactive attention convolutional neural network model.
Step 4: extract semantic features of the freehand sketch and the model views using the optimized interactive attention convolutional neural network model and the Gist feature, and extract the two-dimensional shape distribution features of the freehand sketch and the model views using the two-dimensional shape distribution algorithm.
Step 5: fuse the features in a weighted manner, and retrieve the model most similar to the hand-drawn sketch according to the Euclidean distance.
2. The method for retrieving a three-dimensional model based on an interactive attention convolutional neural network as claimed in claim 1, wherein in step 1, the three-dimensional model is projected to obtain a plurality of corresponding views and an edge detection algorithm is used to obtain the edge view set of the model; the specific steps are as follows:
Step 1-1: place the three-dimensional model at the center of a virtual sphere;
Step 1-2: place a virtual camera above the model and rotate the model through 360 degrees in 30-degree steps to obtain the 12-view set of the three-dimensional model;
Step 1-3: obtain the edge view of each of the 12 original views using the Canny edge detection algorithm.
After projection, the three-dimensional model is characterized as a group of two-dimensional views, and the Canny edge detection algorithm reduces the semantic gap between the hand-drawn sketch and the three-dimensional model views.
3. The method for retrieving the three-dimensional model based on the interactive attention convolutional neural network as claimed in claim 1, wherein in step 2, the deep convolutional neural network is designed and the network model is optimized with the interactive attention module; the specific steps are as follows:
Step 2-1: determine the depth of the convolutional neural network, the size of the convolution kernels, and the number of convolutional and pooling layers;
Step 2-2: design the interactive attention module. A global pooling layer is connected after the output of the convolutional layer to obtain the information amount Z_k of each channel of the convolutional layer conv_n:

Z_k = (1 / (W_n × H_n)) Σ_{i=1}^{W_n} Σ_{j=1}^{H_n} conv_nk(i, j)

where conv_nk denotes the k-th feature map output by the n-th convolutional layer, of size W_n × H_n.
Step 2-3: connect two fully connected layers after the global pooling layer and adaptively adjust the attention weight S_kn of each channel according to the information amount:

S_kn = F_ex(Z, W) = σ(g(Z, W)) = σ(W_2 δ(W_1 Z))

where δ is the ReLU function, σ is the sigmoid function, and W_1 and W_2 are the weights of the first and second fully connected layers, respectively.
Step 2-4: compute the interactive attention weights S_k1 and S_k2 of the two neighbouring convolutional layers respectively and fuse them to obtain the optimal attention weight S_k:

S_k = Average(S_k1, S_k2)

Step 2-5: fuse the attention weight S_k with the second convolutional layer conv_2 and the first pooling layer a_p to obtain the final result a_2:

a_2 = (S_k ⊗ conv_2) ⊕ a_p

where ⊗ denotes channel-wise weighting and ⊕ element-wise fusion.
One part of the view set is selected as the training set and the other part as the test set.
4. The method for retrieving the three-dimensional model based on the interactive attention convolutional neural network as claimed in claim 1, wherein in step 3, the convolutional neural network model is trained; the specific steps are as follows:
Step 3-1: input the training data into the initialized interactive attention convolutional neural network model;
Step 3-2: extract view features through the convolutional layers; the shallow convolutional layers extract low-level features and the deeper convolutional layers extract high-level semantic features;
Step 3-3: the attention module, fused with the neighbouring convolutional layers through channel weighting, reduces the information lost when the edge views of the hand-drawn sketch or the model are pooled;
Step 3-4: reduce the scale of the view features through the pooling layers, thereby reducing the number of parameters and speeding up model computation;
Step 3-5: pass through a Dropout layer to reduce the overfitting caused by insufficient training samples;
Step 3-6: after alternating convolution, attention, Dropout and pooling operations, the features finally enter a fully connected layer, which reduces their dimensionality and connects them into a one-dimensional high-level semantic feature vector;
Step 3-7: use the labeled 2D views to optimize the weights and biases of the interactive attention convolutional neural network during back propagation. The 2D view set is {v_1, v_2, …, v_n} with label set {l_1, l_2, …, l_n}. The 2D views belong to t classes, numbered 1, 2, …, t. After forward propagation, the prediction probability of v_i in class j is y_test_ij. The label l_i of v_i is compared with class j to compute the expected probability y_ij:

y_ij = 1 if l_i = j, otherwise y_ij = 0

Step 3-8: compare the predicted probability y_test_ij with the true probability y_ij and compute the error loss using the cross-entropy loss function:

loss = −Σ_{i=1}^{n} Σ_{j=1}^{t} y_ij log(y_test_ij)

The interactive attention convolutional neural network model is iterated continuously to obtain the optimized model, and the weights and biases are stored.
5. The method for retrieving the three-dimensional model based on the interactive attention convolutional neural network as claimed in claim 1, wherein in step 4, semantic features of the freehand sketch and the model views are extracted with the optimized interactive attention convolutional neural network model and the Gist feature, and their two-dimensional shape distribution features are extracted with the two-dimensional shape distribution algorithm; the specific process is as follows:
Step 4-1: input the test data into the optimized interactive attention convolutional neural network model;
Step 4-2: extract the features of the fully connected layer as the high-level semantic features of the hand-drawn sketch or model view.
Step 4-3: divide the sketch or 2D view of size m × n into 4 × 4 blocks; each block has size a × b, where a = m/4 and b = n/4.
Step 4-4: process each block with 32 Gabor filters of 4 scales and 8 directions, and combine the processed features to obtain the Gist feature:

G(x, y) = cat(I(x, y) * g_ij(x, y)), i = 1, …, 4, j = 1, …, 8

where G(x, y) is the Gist feature over the 32 Gabor filters and cat() denotes the concatenation operation. Here x and y are pixel positions and I(x, y) denotes a block; g_ij(x, y) is the filter at the i-th scale and j-th direction, and * denotes the convolution operation.
Step 4-5: randomly and equidistantly sample points on the boundary of the sketch or 2D view, collected as {(x_1, y_1), …, (x_i, y_i), …, (x_n, y_n)}, where (x_i, y_i) are point coordinates.
Step 4-6: use the D1 descriptor to represent the distance between the centroid and a random sample point on the boundary of the sketch or two-dimensional view. Points are drawn from the samples and collected as PD1 = {ai_1, …, ai_k, …, ai_N}. The D1 shape distribution feature set is {D1_v_1, …, D1_v_i, …, D1_v_Bins}, where D1_v_i is the statistic of the interval (BinSize × (i−1), BinSize × i), Bins is the number of intervals and BinSize is the interval length:

D1_v_i = |{P | dist(P, O) ∈ (BinSize × (i−1), BinSize × i), P ∈ PD1}|

where BinSize = max({dist(P, O) | P ∈ PD1}) / N, dist() is the Euclidean distance between two points, and O is the centroid of the sketch or 2D view.
Step 4-7: use the D2 descriptor to describe the distance between two random sample points on the boundary of the sketch or two-dimensional view. Point pairs are drawn from the samples and collected as PD2 = {(ai_1, bi_1), (ai_2, bi_2), …, (ai_N, bi_N)}. The D2 shape distribution feature set is {D2_v_1, …, D2_v_i, …, D2_v_Bins}, where D2_v_i denotes the statistic in the interval (BinSize × (i−1), BinSize × i):

D2_v_i = |{P | dist(P) ∈ (BinSize × (i−1), BinSize × i), P ∈ PD2}|

where BinSize = max({dist(P) | P ∈ PD2}) / N.
Step 4-8: use the D3 descriptor to describe the square root of the area formed by 3 random sample points on the boundary of the sketch or 2D view. Point triplets are drawn from the samples and collected as PD3 = {(ai_1, bi_1, ci_1), (ai_2, bi_2, ci_2), …, (ai_n, bi_n, ci_n)}. The D3 shape distribution feature set is {D3_v_1, …, D3_v_i, …, D3_v_Bins}, where D3_v_i denotes the statistic in the interval (BinSize × (i−1), BinSize × i):

D3_v_i = |{P | herson(P) ∈ (BinSize × (i−1), BinSize × i), P ∈ PD3}|

where herson() denotes Heron's formula, applied to the triangle P = (P_1, P_2, P_3):

herson(P) = sqrt(Area(P)), Area(P) = sqrt(p(p − a)(p − b)(p − c)), p = (a + b + c) / 2

where a = dist(P_1, P_2), b = dist(P_1, P_3), c = dist(P_2, P_3).
Step 4-9: concatenate D1_v_i, D2_v_i and D3_v_i (i = 1, 2, …, Bins) to form the shape distribution feature.
6. The three-dimensional model retrieval method based on interactive attention CNN and weighted similarity calculation of claim 1, wherein in step 5, a plurality of features are fused and the model most similar to the hand-drawn sketch is retrieved according to the similarity measurement formula; the specific process is as follows:
Step 5-1: select the Euclidean distance as the similarity measure;
Step 5-2: extract feature vectors from the two-dimensional views and the sketch using the improved interactive attention convolutional neural network and normalize them; compute the similarity with the Euclidean distance, denoted distance1, and the retrieval accuracy, denoted t1;
Step 5-3: extract feature vectors of the sketch and the model views using the Gist feature and normalize them; compute the similarity with the Euclidean distance, denoted distance2, and the retrieval accuracy, denoted t2;
Step 5-4: extract feature vectors of the sketch and the model views using the two-dimensional shape distribution feature and normalize them; compute the similarity with the Euclidean distance, denoted distance3, and the retrieval accuracy, denoted t3;
Step 5-5: compare the accuracies of the three features and fuse the similarities with weights to form the new feature similarity Sim(distance):

Sim(distance) = w1*distance1 + w2*distance2 + w3*distance3, with w1 + w2 + w3 = 1

where w1 = t1/(t1+t2+t3), w2 = t2/(t1+t2+t3), w3 = t3/(t1+t2+t3).
Step 5-6: sort by similarity from small to large to produce the retrieval results.
CN202110270518.7A 2021-03-12 2021-03-12 Three-dimensional model retrieval method based on interactive attention convolution neural network Active CN113032613B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110270518.7A CN113032613B (en) 2021-03-12 2021-03-12 Three-dimensional model retrieval method based on interactive attention convolution neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110270518.7A CN113032613B (en) 2021-03-12 2021-03-12 Three-dimensional model retrieval method based on interactive attention convolution neural network

Publications (2)

Publication Number Publication Date
CN113032613A true CN113032613A (en) 2021-06-25
CN113032613B CN113032613B (en) 2022-11-08

Family

ID=76470237

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110270518.7A Active CN113032613B (en) 2021-03-12 2021-03-12 Three-dimensional model retrieval method based on interactive attention convolution neural network

Country Status (1)

Country Link
CN (1) CN113032613B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113658176A (en) * 2021-09-07 2021-11-16 重庆科技学院 Ceramic tile surface defect detection method based on interactive attention and convolutional neural network
CN114842287A (en) * 2022-03-25 2022-08-02 中国科学院自动化研究所 Monocular three-dimensional target detection model training method and device of depth-guided deformer

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101004748A (en) * 2006-10-27 2007-07-25 北京航空航天大学 Method for searching 3D model based on 2D sketch
CN101089846A (en) * 2006-06-16 2007-12-19 国际商业机器公司 Data analysis method, equipment and data analysis auxiliary method
CN101110826A (en) * 2007-08-22 2008-01-23 张建中 Method, device and system for constructing multi-dimensional address
CN101350016A (en) * 2007-07-20 2009-01-21 富士通株式会社 Device and method for searching three-dimensional model
CN103295025A (en) * 2013-05-03 2013-09-11 南京大学 Automatic selecting method of three-dimensional model optimal view
CN105243137A (en) * 2015-09-30 2016-01-13 华南理工大学 Draft-based three-dimensional model retrieval viewpoint selection method
CN107122396A (en) * 2017-03-13 2017-09-01 西北大学 Three-dimensional model searching algorithm based on depth convolutional neural networks
US20180039856A1 (en) * 2016-08-04 2018-02-08 Takayuki Hara Image analyzing apparatus, image analyzing method, and recording medium
CN109783887A (en) * 2018-12-25 2019-05-21 西安交通大学 A kind of intelligent recognition and search method towards Three-dimension process feature
CN110033023A (en) * 2019-03-11 2019-07-19 北京光年无限科技有限公司 It is a kind of based on the image processing method and system of drawing this identification
CN110569386A (en) * 2019-09-16 2019-12-13 哈尔滨理工大学 Three-dimensional model retrieval method based on hand-drawn sketch integrated descriptor
CN111078913A (en) * 2019-12-16 2020-04-28 天津运泰科技有限公司 Three-dimensional model retrieval method based on multi-view convolution neural network
CN111242207A (en) * 2020-01-08 2020-06-05 天津大学 Three-dimensional model classification and retrieval method based on visual saliency information sharing
CN111597367A (en) * 2020-05-18 2020-08-28 河北工业大学 Three-dimensional model retrieval method based on view and Hash algorithm
CN111625667A (en) * 2020-05-18 2020-09-04 北京工商大学 Three-dimensional model cross-domain retrieval method and system based on complex background image

Patent Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101089846A (en) * 2006-06-16 2007-12-19 国际商业机器公司 Data analysis method, equipment and data analysis auxiliary method
CN101004748A (en) * 2006-10-27 2007-07-25 北京航空航天大学 Method for searching 3D model based on 2D sketch
CN101350016A (en) * 2007-07-20 2009-01-21 富士通株式会社 Device and method for searching three-dimensional model
CN101110826A (en) * 2007-08-22 2008-01-23 张建中 Method, device and system for constructing multi-dimensional address
CN103295025A (en) * 2013-05-03 2013-09-11 南京大学 Automatic selecting method of three-dimensional model optimal view
CN105243137A (en) * 2015-09-30 2016-01-13 华南理工大学 Draft-based three-dimensional model retrieval viewpoint selection method
US20180039856A1 (en) * 2016-08-04 2018-02-08 Takayuki Hara Image analyzing apparatus, image analyzing method, and recording medium
CN107122396A (en) * 2017-03-13 2017-09-01 西北大学 Three-dimensional model searching algorithm based on depth convolutional neural networks
CN109783887A (en) * 2018-12-25 2019-05-21 西安交通大学 A kind of intelligent recognition and search method towards Three-dimension process feature
CN110033023A (en) * 2019-03-11 2019-07-19 北京光年无限科技有限公司 It is a kind of based on the image processing method and system of drawing this identification
CN110569386A (en) * 2019-09-16 2019-12-13 哈尔滨理工大学 Three-dimensional model retrieval method based on hand-drawn sketch integrated descriptor
CN111078913A (en) * 2019-12-16 2020-04-28 天津运泰科技有限公司 Three-dimensional model retrieval method based on multi-view convolution neural network
CN111242207A (en) * 2020-01-08 2020-06-05 天津大学 Three-dimensional model classification and retrieval method based on visual saliency information sharing
CN111597367A (en) * 2020-05-18 2020-08-28 河北工业大学 Three-dimensional model retrieval method based on view and Hash algorithm
CN111625667A (en) * 2020-05-18 2020-09-04 北京工商大学 Three-dimensional model cross-domain retrieval method and system based on complex background image

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
LI ZHI LIN: "Improving Variational Auto-Encoder with Self-Attention and Mutual Information for Image Generation", https://doi.org/10.1145/3376067.3376090 *
FANG Zhixiang et al.: "Research trends in pedestrian navigation from absolute space positioning to relative space perception", Journal of Wuhan University (Information Science Edition) *
WANG Xinying et al.: "Weight-optimized ensemble convolutional neural network and its application in three-dimensional model recognition", Journal of Graphics *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113658176A (en) * 2021-09-07 2021-11-16 重庆科技学院 Ceramic tile surface defect detection method based on interactive attention and convolutional neural network
CN113658176B (en) * 2021-09-07 2023-11-07 重庆科技学院 Ceramic tile surface defect detection method based on interaction attention and convolutional neural network
CN114842287A (en) * 2022-03-25 2022-08-02 中国科学院自动化研究所 Monocular three-dimensional target detection model training method and device of depth-guided deformer
CN114842287B (en) * 2022-03-25 2022-12-06 中国科学院自动化研究所 Monocular three-dimensional target detection model training method and device of depth-guided deformer

Also Published As

Publication number Publication date
CN113032613B (en) 2022-11-08

Similar Documents

Publication Publication Date Title
Wang et al. Enhancing sketch-based image retrieval by cnn semantic re-ranking
CN108564129B (en) Trajectory data classification method based on generation countermeasure network
CN108038122B (en) Trademark image retrieval method
CN110598029A (en) Fine-grained image classification method based on attention transfer mechanism
CN112633350B (en) Multi-scale point cloud classification implementation method based on graph convolution
CN110222218B (en) Image retrieval method based on multi-scale NetVLAD and depth hash
CN112446388A (en) Multi-category vegetable seedling identification method and system based on lightweight two-stage detection model
CN106682233A (en) Method for Hash image retrieval based on deep learning and local feature fusion
CN106909924A (en) A kind of remote sensing image method for quickly retrieving based on depth conspicuousness
CN110633708A (en) Deep network significance detection method based on global model and local optimization
CN104778242A (en) Hand-drawn sketch image retrieval method and system on basis of image dynamic partitioning
CN102385592B (en) Image concept detection method and device
CN112541532B (en) Target detection method based on dense connection structure
CN112115291B (en) Three-dimensional indoor model retrieval method based on deep learning
CN111125411A (en) Large-scale image retrieval method for deep strong correlation hash learning
CN110751027B (en) Pedestrian re-identification method based on deep multi-instance learning
CN113032613B (en) Three-dimensional model retrieval method based on interactive attention convolution neural network
WO2023019698A1 (en) Hyperspectral image classification method based on rich context network
CN105320764A (en) 3D model retrieval method and 3D model retrieval apparatus based on slow increment features
CN112733602B (en) Relation-guided pedestrian attribute identification method
CN110263855A (en) A method of it is projected using cobasis capsule and carries out image classification
CN115457332A (en) Image multi-label classification method based on graph convolution neural network and class activation mapping
CN113806580B (en) Cross-modal hash retrieval method based on hierarchical semantic structure
Zhang et al. Semisupervised center loss for remote sensing image scene classification
CN111125396A (en) Image retrieval method of single-model multi-branch structure

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant