CN113032613A - Three-dimensional model retrieval method based on interactive attention convolution neural network - Google Patents

Three-dimensional model retrieval method based on interactive attention convolution neural network

Info

Publication number
CN113032613A
CN113032613A (application CN202110270518.7A; granted as CN113032613B)
Authority
CN
China
Prior art keywords
model
view
neural network
sketch
dimensional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110270518.7A
Other languages
Chinese (zh)
Other versions
CN113032613B (en)
Inventor
贾雯惠
高雪瑶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin University of Science and Technology
Original Assignee
Harbin University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin University of Science and Technology
Priority to CN202110270518.7A priority Critical patent/CN113032613B/en
Publication of CN113032613A publication Critical patent/CN113032613A/en
Application granted granted Critical
Publication of CN113032613B publication Critical patent/CN113032613B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/13Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/60Analysis of geometric attributes
    • G06T7/66Analysis of geometric attributes of image moments or centre of gravity
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Abstract

The invention provides a three-dimensional model retrieval method based on an interactive attention convolutional neural network. First, the three-dimensional model is preprocessed: the projection angles are fixed to obtain 6 views of the three-dimensional model, which are converted into line drawings to serve as the model's view set. Secondly, an interactive attention module is embedded in the convolutional neural network to extract semantic features, increasing data interaction between two network layers of the convolutional neural network. Global features are extracted with the Gist algorithm and a two-dimensional shape distribution algorithm. Thirdly, the similarity between the sketch and each two-dimensional view is computed using the Euclidean distance. These similarities are then combined with weights to retrieve the three-dimensional model. The method alleviates the inaccurate semantic features caused by overfitting when a neural network is trained on small-sample data, and improves the accuracy of three-dimensional model retrieval.

Description

Three-dimensional model retrieval method based on interactive attention convolution neural network
Technical field:
The invention relates to a three-dimensional model retrieval method based on an interactive attention convolutional neural network, which is well applied in the field of three-dimensional model retrieval.
Background art:
In recent years, with the continuous development of science and technology, three-dimensional models not only play an important role in many professional fields but are also widespread in daily life, and the demand for retrieving three-dimensional models is gradually increasing. Example-based three-dimensional model retrieval can only take models already in the database as queries, so it lacks generality. Sketch-based three-dimensional model retrieval lets users draw freely according to their needs, is convenient and applicable, and has broad prospects.
Currently, common algorithms use a single hand-crafted feature or a deep learning algorithm to solve the sketch-based model retrieval problem. However, traditional hand-crafted features have shortcomings: researchers need a large amount of prior knowledge, parameters must be set manually in advance, and the extracted features often fall short of expectations. A deep learning algorithm can adjust parameters automatically and has good extensibility, but it also has drawbacks. Because a deep neural network has many nodes, a large amount of data is needed to train it to obtain excellent results; once the training data are insufficient, overfitting occurs and the results deviate. In order to obtain better retrieval results when training samples are insufficient, the invention provides a three-dimensional model retrieval method based on an interactive attention convolutional neural network.
Summary of the invention:
The invention discloses a three-dimensional model retrieval method based on an interactive attention convolutional neural network, aiming to solve the poor retrieval performance of sketch-based three-dimensional model retrieval methods when training samples are insufficient.
Therefore, the invention provides the following technical scheme:
1. a three-dimensional model retrieval method based on an interactive attention convolution neural network is characterized by comprising the following steps:
step 1: and carrying out data preprocessing, projecting the three-dimensional model to obtain a plurality of views corresponding to the three-dimensional model, and obtaining an edge view set of the model by using an edge detection algorithm.
Step 2: designing a deep convolutional neural network, and optimizing a network model by using an interactive attention module. And selecting one part of the view sets as a training set, and the other part of the view sets as a test set.
And step 3: the training includes two processes, forward propagation and backward propagation. And training the training data serving as input of the interactive attention convolution neural network model to obtain the optimized interactive attention convolution neural network model.
And 4, step 4: and respectively extracting semantic features of the freehand sketch and the model view by using the optimized interactive attention convolution neural network model and the gist feature, and respectively extracting two-dimensional shape distribution features of the freehand sketch and the model view by using the two-dimensional shape distribution feature.
And 5: and fusing the plurality of features in a weighting mode. And retrieving the model which is most similar to the hand-drawn sketch according to the Euclidean distance.
2. The method for retrieving a three-dimensional model based on an interactive attention convolutional neural network as claimed in claim 1, wherein in step 1, the three-dimensional model is projected to obtain a plurality of corresponding views and an edge detection algorithm is used to obtain the edge view set of the model; the specific steps are as follows:
Step 1-1: place the three-dimensional model at the center of a virtual sphere;
Step 1-2: place a virtual camera above the model and rotate the model through 360 degrees in 30-degree steps to obtain the 12-view set of the three-dimensional model;
Step 1-3: obtain the edge view of each of the 12 original views using the Canny edge detection algorithm.
After projection, the three-dimensional model is characterized as a group of two-dimensional views, and the Canny edge detection algorithm reduces the semantic gap between the hand-drawn sketch and the three-dimensional model views.
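As an illustration, a minimal Python sketch of this edge-view construction follows; it assumes the 12 projected views have already been rendered by the virtual camera to image files (the file names and Canny thresholds are assumptions, not values from the patent):

```python
# Minimal sketch of step 1, assuming the 12 views were already rendered
# to view_00.png ... view_11.png (file names and thresholds illustrative).
import cv2

def build_edge_view_set(view_paths, low=100, high=200):
    """Apply Canny edge detection to each projected view (step 1-3)."""
    edge_views = []
    for path in view_paths:
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        edge_views.append(cv2.Canny(gray, low, high))
    return edge_views

edge_set = build_edge_view_set([f"view_{i:02d}.png" for i in range(12)])
```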
3. The method for retrieving the three-dimensional model based on the interactive attention convolutional neural network as claimed in claim 1, wherein in step 2, the deep convolutional neural network is designed and the network model is optimized with the interactive attention module; the specific steps are as follows:
Step 2-1: determine the depth of the convolutional neural network, the size of the convolution kernels, and the number of convolutional and pooling layers;
Step 2-2: design the interactive attention module. A global pooling layer is connected after the output of the convolutional layer to obtain the information amount Z_k of each channel of the convolutional layer conv_n:

Z_k = (1 / (W_n × H_n)) Σ_{i=1}^{W_n} Σ_{j=1}^{H_n} conv_nk(i, j)

where conv_nk denotes the k-th feature map output by the n-th convolutional layer, of size W_n × H_n.
Step 2-3: connect two fully connected layers after the global pooling layer and adaptively adjust the attention weight S_kn of each channel according to the information amount:

S_kn = F_ex(Z, W) = σ(g(Z, W)) = σ(W_2 δ(W_1 Z))

where δ is the ReLU function, σ is the sigmoid function, and W_1 and W_2 are the weights of the first and second fully connected layers, respectively.
Step 2-4: compute the interactive attention weights S_k1 and S_k2 of the two neighbouring convolutional layers respectively and fuse them to obtain the optimal attention weight S_k:

S_k = Average(S_k1, S_k2)

Step 2-5: fuse the attention weight S_k with the second convolutional layer conv_2 and the first pooling layer a_p to obtain the final result a_2:

a_2 = (S_k ⊗ conv_2) ⊕ a_p

where ⊗ denotes channel-wise weighting and ⊕ element-wise fusion.
One part of the view set is selected as the training set and the other part as the test set.
4. The method for retrieving the three-dimensional model based on the interactive attention convolutional neural network as claimed in claim 1, wherein in step 3, the convolutional neural network model is trained; the specific steps are as follows:
Step 3-1: input the training data into the initialized interactive attention convolutional neural network model;
Step 3-2: extract view features through the convolutional layers; the shallow convolutional layers extract low-level features and the deeper convolutional layers extract high-level semantic features;
Step 3-3: the attention module, fused with the neighbouring convolutional layers through channel weighting, reduces the information lost when the edge views of the hand-drawn sketch or the model are pooled;
Step 3-4: reduce the scale of the view features through the pooling layers, thereby reducing the number of parameters and speeding up model computation;
Step 3-5: pass through a Dropout layer to reduce the overfitting caused by insufficient training samples;
Step 3-6: after alternating convolution, attention, Dropout and pooling operations, the features finally enter a fully connected layer, which reduces their dimensionality and connects them into a one-dimensional high-level semantic feature vector;
Step 3-7: use the labeled 2D views to optimize the weights and biases of the interactive attention convolutional neural network during back propagation. The 2D view set is {v_1, v_2, …, v_n} with label set {l_1, l_2, …, l_n}. The 2D views belong to t classes, numbered 1, 2, …, t. After forward propagation, the prediction probability of v_i in class j is y_test_ij. The label l_i of v_i is compared with class j to compute the expected probability y_ij:

y_ij = 1 if l_i = j, otherwise y_ij = 0

Step 3-8: compare the predicted probability y_test_ij with the true probability y_ij and compute the error loss using the cross-entropy loss function:

loss = −Σ_{i=1}^{n} Σ_{j=1}^{t} y_ij log(y_test_ij)

The interactive attention convolutional neural network model is iterated continuously to obtain the optimized model, and the weights and biases are stored.
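One plausible training loop for steps 3-1 to 3-8 is sketched below, with the one-hot expected probability y_ij and the cross-entropy loss written out explicitly; the optimizer settings and data loading are assumptions:

```python
# Plausible training loop for steps 3-1 to 3-8; loader and optimizer are
# assumptions, `model` is the interactive attention CNN.
import torch.nn.functional as F

def train_epoch(model, loader, optimizer, num_classes):
    model.train()
    for views, labels in loader:                 # labeled 2D edge views
        logits = model(views)                    # forward propagation
        y = F.one_hot(labels, num_classes).float()   # expected probability y_ij
        log_y_test = F.log_softmax(logits, dim=1)    # log of y_test_ij
        loss = -(y * log_y_test).sum(dim=1).mean()   # cross-entropy loss
        optimizer.zero_grad()
        loss.backward()                          # back propagation
        optimizer.step()                         # update weights and biases
```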
5. The method for retrieving the three-dimensional model based on the interactive attention convolutional neural network as claimed in claim 1, wherein in step 4, semantic features of the freehand sketch and the model views are extracted with the optimized interactive attention convolutional neural network model and the Gist feature, and their two-dimensional shape distribution features are extracted with the two-dimensional shape distribution algorithm; the specific process is as follows:
Step 4-1: input the test data into the optimized interactive attention convolutional neural network model;
Step 4-2: extract the features of the fully connected layer as the high-level semantic features of the hand-drawn sketch or model view.
Step 4-3: divide the sketch or 2D view of size m × n into 4 × 4 blocks; each block has size a × b, where a = m/4 and b = n/4.
Step 4-4: process each block with 32 Gabor filters of 4 scales and 8 directions, and combine the processed features to obtain the Gist feature:

G(x, y) = cat(I(x, y) * g_ij(x, y)), i = 1, …, 4, j = 1, …, 8

where G(x, y) is the Gist feature over the 32 Gabor filters and cat() denotes the concatenation operation. Here x and y are pixel positions and I(x, y) denotes a block; g_ij(x, y) is the filter at the i-th scale and j-th direction, and * denotes the convolution operation.
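A minimal sketch of this Gist computation using OpenCV Gabor kernels is given below; the filter parameters (kernel size, sigma, wavelength) are assumptions, but the block grid and filter bank sizes follow the text, giving 4 × 4 × 32 = 512 dimensions:

```python
# Sketch of Gist extraction (steps 4-3 and 4-4): 4x4 blocks, 32 Gabor
# filters (4 scales x 8 directions); Gabor parameters are assumptions.
import cv2
import numpy as np

def gist_feature(img, grid=4, scales=4, orientations=8):
    m, n = img.shape
    a, b = m // grid, n // grid                  # block size a*b (step 4-3)
    feats = []
    for s in range(scales):
        for o in range(orientations):
            kern = cv2.getGaborKernel((15, 15), sigma=2.0 + s,
                                      theta=o * np.pi / orientations,
                                      lambd=4.0 + 2 * s, gamma=0.5)
            resp = cv2.filter2D(img.astype(np.float32), -1, kern)
            for i in range(grid):
                for j in range(grid):
                    feats.append(resp[i*a:(i+1)*a, j*b:(j+1)*b].mean())
    return np.array(feats)                       # 4*4*32 = 512 dimensions
```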
Step 4-5: randomly and equidistantly sample points on the boundary of the sketch or 2D view, collected as {(x_1, y_1), …, (x_i, y_i), …, (x_n, y_n)}, where (x_i, y_i) are point coordinates.
Step 4-6: use the D1 descriptor to represent the distance between the centroid and a random sample point on the boundary of the sketch or two-dimensional view. Points are drawn from the samples and collected as PD1 = {ai_1, …, ai_k, …, ai_N}. The D1 shape distribution feature set is {D1_v_1, …, D1_v_i, …, D1_v_Bins}, where D1_v_i is the statistic of the interval (BinSize × (i−1), BinSize × i), Bins is the number of intervals and BinSize is the interval length:

D1_v_i = |{P | dist(P, O) ∈ (BinSize × (i−1), BinSize × i), P ∈ PD1}|

where BinSize = max({dist(P, O) | P ∈ PD1}) / N, dist() is the Euclidean distance between two points, and O is the centroid of the sketch or 2D view.
Step 4-7: use the D2 descriptor to describe the distance between two random sample points on the boundary of the sketch or two-dimensional view. Point pairs are drawn from the samples and collected as PD2 = {(ai_1, bi_1), (ai_2, bi_2), …, (ai_N, bi_N)}. The D2 shape distribution feature set is {D2_v_1, …, D2_v_i, …, D2_v_Bins}, where D2_v_i denotes the statistic in the interval (BinSize × (i−1), BinSize × i):

D2_v_i = |{P | dist(P) ∈ (BinSize × (i−1), BinSize × i), P ∈ PD2}|

where BinSize = max({dist(P) | P ∈ PD2}) / N.
Step 4-8: use the D3 descriptor to describe the square root of the area formed by 3 random sample points on the boundary of the sketch or 2D view. Point triplets are drawn from the samples and collected as PD3 = {(ai_1, bi_1, ci_1), (ai_2, bi_2, ci_2), …, (ai_n, bi_n, ci_n)}. The D3 shape distribution feature set is {D3_v_1, …, D3_v_i, …, D3_v_Bins}, where D3_v_i denotes the statistic in the interval (BinSize × (i−1), BinSize × i):

D3_v_i = |{P | herson(P) ∈ (BinSize × (i−1), BinSize × i), P ∈ PD3}|

where herson() denotes Heron's formula, applied to the triangle P = (P_1, P_2, P_3):

herson(P) = sqrt(Area(P)), Area(P) = sqrt(p(p − a)(p − b)(p − c)), p = (a + b + c) / 2

where a = dist(P_1, P_2), b = dist(P_1, P_3), c = dist(P_2, P_3).
Step 4-9: concatenate D1_v_i, D2_v_i and D3_v_i (i = 1, 2, …, Bins) to form the shape distribution feature.
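The D1/D2/D3 histograms of steps 4-5 to 4-9 can be sketched as follows; the sample count, bin count and histogram ranges are assumptions:

```python
# Sketch of the D1/D2/D3 shape distribution features (steps 4-5 to 4-9);
# sample and bin counts are assumptions.
import numpy as np

def shape_distribution(points, bins=64, n_samples=1024, seed=0):
    rng = np.random.default_rng(seed)
    pts = np.asarray(points, dtype=float)
    pick = lambda: pts[rng.integers(len(pts), size=n_samples)]

    o = pts.mean(axis=0)                             # centroid O
    d1 = np.linalg.norm(pick() - o, axis=1)          # D1: point to centroid

    d2 = np.linalg.norm(pick() - pick(), axis=1)     # D2: point to point

    p, q, r = pick(), pick(), pick()                 # D3: triangle area
    a = np.linalg.norm(p - q, axis=1)
    b = np.linalg.norm(p - r, axis=1)
    c = np.linalg.norm(q - r, axis=1)
    s = (a + b + c) / 2
    area = np.sqrt(np.clip(s*(s-a)*(s-b)*(s-c), 0, None))  # Heron's formula
    d3 = np.sqrt(area)                               # square root of the area

    hist = lambda d: np.histogram(d, bins=bins, range=(0, d.max()))[0]
    return np.concatenate([hist(d1), hist(d2), hist(d3)])  # step 4-9
```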
6. The three-dimensional model retrieval method based on interactive attention CNN and weighted similarity calculation of claim 1, wherein in step 5, a plurality of features are fused and the model most similar to the hand-drawn sketch is retrieved according to the similarity measurement formula; the specific process is as follows:
Step 5-1: select the Euclidean distance as the similarity measure;
Step 5-2: extract feature vectors from the two-dimensional views and the sketch using the improved interactive attention convolutional neural network and normalize them; compute the similarity with the Euclidean distance, denoted distance1, and the retrieval accuracy, denoted t1;
Step 5-3: extract feature vectors of the sketch and the model views using the Gist feature and normalize them; compute the similarity with the Euclidean distance, denoted distance2, and the retrieval accuracy, denoted t2;
Step 5-4: extract feature vectors of the sketch and the model views using the two-dimensional shape distribution feature and normalize them; compute the similarity with the Euclidean distance, denoted distance3, and the retrieval accuracy, denoted t3;
Step 5-5: compare the accuracies of the three features and fuse the similarities with weights to form the new feature similarity Sim(distance):

Sim(distance) = w1*distance1 + w2*distance2 + w3*distance3, with w1 + w2 + w3 = 1

where w1 = t1/(t1+t2+t3), w2 = t2/(t1+t2+t3), w3 = t3/(t1+t2+t3).
Step 5-6: sort by similarity from small to large to produce the retrieval results.
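A small sketch of this weighted fusion follows; the default accuracy values are the ones reported in the embodiment below (0.96, 0.53, 0.42), and the function name is illustrative:

```python
# Sketch of the step 5 weighted fusion; the distance arguments hold the
# per-candidate distances, accuracies default to the embodiment's values.
import numpy as np

def fuse_and_rank(distance1, distance2, distance3, acc=(0.96, 0.53, 0.42)):
    t1, t2, t3 = acc
    w1, w2, w3 = (t / (t1 + t2 + t3) for t in acc)   # weights from accuracies
    sim = w1 * distance1 + w2 * distance2 + w3 * distance3
    return np.argsort(sim)     # ascending: smallest distance = best match
```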
Beneficial effects:
1. The invention discloses a three-dimensional model retrieval method based on an interactive attention convolutional neural network. Model retrieval was performed on the SHREC13 database and the ModelNet40 database, and experimental results show that the method achieves high accuracy.
2. The retrieval model combines an interactive attention module with a convolutional neural network. The convolutional neural network has local perception and parameter sharing, handles high-dimensional data well, and requires no manual selection of data features. The proposed interactive attention model combines the attention weights of two adjacent convolutional layers to realize data interaction between the two network layers, and the trained convolutional neural network model achieves a better retrieval effect.
3. During training, parameters are updated by stochastic gradient descent. The error propagates back along its original route: the parameters of each layer are updated layer by layer, from the output layer backwards through each hidden layer, until the error reaches the input layer. Forward and backward propagation are performed continuously to reduce the error and update the model parameters until the CNN is trained.
4. The invention improves the three-dimensional shape distribution features so that the method suits sketches and two-dimensional views; shape information of the sketch and the three-dimensional model views is described with a shape distribution function.
5. The invention adaptively fuses the similarities of the various features, achieving a better retrieval effect.
Description of the drawings:
fig. 1 is a sketch to be retrieved in an embodiment of the present invention.
Fig. 2 is a three-dimensional model search framework diagram according to an embodiment of the present invention.
FIG. 3 is a projection view of a model in an embodiment of the invention.
Fig. 4 is a Canny edge view in an embodiment of the invention.
FIG. 5 is a model of an interactive attention convolutional neural network in an embodiment of the present invention.
FIG. 6 illustrates a training process of the Interactive attention convolutional neural network in an embodiment of the present invention.
FIG. 7 illustrates a testing process of the Interactive attention convolutional neural network in an embodiment of the present invention.
Detailed description of the embodiments:
In order to clearly and completely describe the technical solutions in the embodiments of the present invention, the invention is further described in detail below with reference to the drawings in the embodiments.
The invention uses sketches from SHREC13 and model data from the ModelNet40 model library for experimental verification, taking "17205.png" from the SHREC13 sketches and "table_0399.off" from the ModelNet40 model library as examples. The sketch to be retrieved is shown in fig. 1.
The experimental framework of the three-dimensional model retrieval method based on the interactive attention convolutional neural network, shown in fig. 2, comprises the following steps:
step 1, projecting the three-dimensional model to obtain a three-dimensional model edge view set, which specifically comprises the following steps:
step 1-1, a table _0399.off file is placed in the center of a virtual sphere.
Step 1-2, placing a virtual camera above the model, and rotating the model by 360 degrees at each step by 30 degrees, so as to obtain 12 view sets of the three-dimensional model, wherein one view is taken as an example for display, and the projection view of the model is shown in fig. 3;
the views obtained by steps 1-3 using the Canny edge detection algorithm are shown in fig. 4;
step 2, designing a deep convolutional neural network, and optimizing a network model by using an interactive attention module, as shown in fig. 5, specifically:
and 2-1, designing a deep convolutional neural network for better characteristic extraction effect, wherein the deep convolutional neural network comprises 5 convolutional layers, 4 pooling layers, two dropout layers, a connecting layer and a full connecting layer.
Step 2-2: embed the interactive attention module into the designed convolutional neural network structure, connect a global pooling layer after the output of the convolutional layer, and obtain the information amount Z_k of each channel. Taking the sketch as an example, the information amounts of its first convolutional layer are as follows:
Zk=[[0.0323739 0.04996519 0.0190248 0.03274497 0.03221277 0.00206719 0.04075038 0.01613641 0.03390235 0.04024649 0.03553107 0.00632962 0.03442683 0.04588291 0.01900478 0.02144121 0.03710039 0.03861086 0.05596253 0.0439686 0.03611921 0.04850776 0.00716817 0.02596463 0.00525256 0.03657651 0.02809189 0.03490375 0.04528182 0.03938764 0.00690786 0.04449471]]
step 2-3, connecting two full connection layers after the global pooling layer, and adaptively adjusting the attention weight S of each channel according to the information quantitykn. Taking the sketch as an example, the attention weights of the sketch are as follows:
Skn=[[0.49450904 0.49921992 0.50748134 0.5051483 0.5093386 0.49844238 0.50426346 0.50664175 0.5053692 0.5012332 0.5004162 0.49788538 0.505669 0.5012219 0.5009724 0.4942028 0.49796405 0.4992011 0.5064934 0.4963113 0.50500274 0.50238824 0.50202376 0.49661288 0.50185806 0.5048757 0.5073203 0.50703263 0.51684725 0.50641936 0.5052296 0.4979179]]
step 2-4 calculating the interactive attention weight S of two neighborhood convolution layers respectivelyk1And Sk2And fusing the data to obtain the optimal attention weight SkThe optimal attention weight for the sketch is as follows:
Sk=[[0.4625304 0.47821882 0.5064253 0.5032532 0.5093386 0.49877496 0.50426346 0.50664175 0.5053692 0.5012332 0.5004162 0.49784237 0.505688 0.5011142 0.5008647 0.4942028 0.49796405 0.4991069 0.5064934 0.4963113 0.5102687 0.50125698 0.502524856 0.49675384 0.49365704 0.5027958 0.5076529 0.50814523 0.51006527 0.50361942 0.50422731 0.4635842]]
step 2-5 will note the weight SkAnd second convolution layer conv2The first pooling layer apFusing to obtain final result a2Partial results for the second convolution layer of the sketch are:
a2=[[[[0.14450312 0.0644969 0.10812703...0.18608719 0.01994037 0]
[0.18341058 0.15881275 0.24716881...0.18875208 0.14420813 0.08290599]
[0.17390229 0.14937611 0.2255666...0.15295741 0.18792515 0.08066748]
...
[0.31344187 0.18656467 0.22178406...0.22087486 0.22130579 0.00955889]
[0.12405898 0.10548315 0.11685486...0.10439464 0.2906406 0.14846338]]
[[0.10032222 0.21919143 0.09797319...0.13584027 0. 0.12112971]
[0.20946684 0.14252397 0.17954415...0.09708451 0. 0.15463363]
[0.06941956 0.03963253 0.13273408...0.00173131 0.04566149 0.14895247]
...
[[0.01296724 0.27460644 0.09022377...0.06938899 0.04487894 0.2567152]
[0.16118288 0.38024116 0.02033611...0.13374138 0 0.17068687]
[0.09430372 0.35878736 0...0.0846955 0 0.25289127]
...
[0.10363265 0.4103881 0...0.0728834 0 0.29586816]
[0.18578637 0.34666267 0...0.05323519 0 0.27042198]
[0.0096841 0.18718664 0...0.04646093 0.00576336 0.155898]]]]
step 3, training the convolutional neural network model, as shown in fig. 6, specifically comprising the steps of:
step 3-1, inputting the sketch and the edge two-dimensional view into an initialized interactive attention convolution neural network as training data;
step 3-2, extracting more detailed view characteristics through the convolution layer;
Step 3-3: after the attention module is fused with the neighbouring convolutional layers through channel weighting, the information lost when the edge views of the hand-drawn sketch or the model are pooled can be reduced;
step 3-4, extracting the maximum view information through a pooling layer;
step 3-5, passing through a Dropout layer, reducing the overfitting problem caused by insufficient training samples;
Step 3-6: after alternating convolution, attention, Dropout and pooling operations, the features finally enter a fully connected layer, which reduces their dimensionality and connects them into a one-dimensional high-level semantic feature vector;
steps 3-7 learned by the softmax function that the probability of the sketch "17205. png" under the "table" category is 89.99%
Step 3-8: compare the predicted probability y_test_ij with the true probability y_ij and compute the error loss using the cross-entropy loss function:

loss_17205 = −log(0.8999) ≈ 0.105

where loss_17205 is the error of the sketch "17205.png".
The interactive attention convolutional neural network model is iterated continuously to obtain the optimized interactive attention convolutional neural network model.
Step 4, extracting semantic features and shape distribution features, specifically:
step 4-1, inputting the test data into the optimized interactive attention convolution neural network model, wherein the test process is shown in FIG. 7;
and 4-2, extracting the features of the full connection layer to be used as high-level semantic features of the hand-drawn sketch or the model view. Part of the high-level semantic features of the extracted sketch are as follows:
Feature=[[0,0.87328064,0,0,1.3293583,0,2.3825126,0,0,4.8035927,0,1.5186063,0,3.6845286,1.0825952,0,1.8516512,1.0285587,0,0,0,3.3322043,1.0545557,0,0,4.8707848,3.042554,0,0,0,0,6.8227463,2.537525,1.5318785,2.7271123,0,3.0482264……]]
step 4-3 dividing the size sketch or two-dimensional view into 4 x 4 blocks;
step 4-4 each block is processed by 32 Gabor filters of 4 scales, 8 directions. And combining the processed features to obtain gist features. The Gist feature is extracted 512 dimensions, and the partial Gist feature of the sketch is as follows:
G(x,y)=[[5.81147151e-03 1.51588341e-02 1.75721212e-03 2.10059434e-01 1.62918585e-01 1.54040498e-01 1.44374291e-01 8.71880878e-01 5.26758657e-01 4.14263371e-01 7.17606844e-01 6.22190594e-01 1.11205845e-01 7.69002490e-04 2.18182730e-01 2.29565939e-01 9.32599080e-03 1.10805327e-02 1.40071468e-03 2.58543039e-01 5.67934220e-02 1.06132064e-01 9.10082146e-02 4.02163211e-01 2.97883778e-01 2.45860956e-01 4.02066928e-01 2.84401506e-01 1.03228724e-01 6.37419945e-04 2.71290458e-01……]]
step 4-5, randomly and equidistantly sampling points on the boundary of the sketch or the two-dimensional view;
steps 4-6 use the D1 descriptor to represent the distance between the centroid and the random sample point on the boundary of the sketch or two-dimensional view. The portion D1 descriptor of the sketch is as follows:
D1=[0.30470497858541628,0.6256941275550102,0.11237884569183111,0.23229854666522,0.2657159486944761,0.0731852015843772,0.40751749800795261……]
steps 4-7 describe the distance between two random sample points on the boundary of the sketch or two-dimensional view using the D2 descriptor. The portion D2 descriptor of the sketch is as follows:
D2=[0.13203683803844625,0.028174099301372796,0.15392681513105217,0.130238265264,0.123460163767958,0.06985106421513015,0.12992235205980568……]
steps 4-8 utilize the D3 descriptor to describe the square root of the area formed by the 3 random sample points on the boundary of the sketch or two-dimensional view. The portion D3 descriptor of the sketch is as follows:
D3=[0.9193157274532394,0.5816923854309814,0.46980644879802125,0.498873567635874,0.7195175116705602,0.29425190983247506,0.8724092377243926……]
step 4-9, connecting D1, D2 and D3 in series to form a shape distribution characteristic;
and 5, fusing a plurality of characteristics of the sketch, and retrieving a model most similar to the hand-drawn sketch according to the similarity measurement formula, wherein the method specifically comprises the following steps:
step 5-1, comparing various similarity retrieval methods, wherein the final effect is best in Euclidean distance;
and 5-2, extracting feature vectors from the two-dimensional view and the sketch by using the improved interactive attention convolution neural network, and normalizing the feature vectors. Calculating the similarity by using the Euclidean distance, marking as distance1, and the retrieval accuracy is 0.96;
and 5-3, extracting the feature vectors of the sketch and the model view by using the gist features, and normalizing the feature vectors. Calculating the similarity by using the Euclidean distance, marking as distance2, and the retrieval accuracy is 0.53;
and 5-4, extracting a feature vector between the sketch and the model view by using the two-dimensional shape distribution feature, and performing normalization processing on the feature vector. Calculating the similarity by using the Euclidean distance, marking as distance3, and the accuracy of retrieval is 0.42;
and 5-5, determining the weight according to the retrieval accuracy of the three characteristics.
Figure BDA0002974166900000151
Figure BDA0002974166900000152
Figure BDA0002974166900000153
The final weight is determined as: 5:3:2
Sim(distance)=0.5*distance1+0.3*distance2+0.2*distance
Step 5-6: sort by similarity from small to large to produce the retrieval results.
The three-dimensional model retrieval method based on the interactive attention convolutional neural network in this embodiment adopts a weighted fusion of traditional features and deep features, thereby achieving a better retrieval effect.
The foregoing is a detailed description of embodiments of the invention in conjunction with the accompanying drawings; the specific embodiments are provided merely to assist in understanding the method of the invention. Those skilled in the art may modify and adapt the invention within the scope of the embodiments and applications according to its spirit, and the invention should not be construed as being limited thereto.

Claims (6)

1. A three-dimensional model retrieval method based on an interactive attention convolution neural network is characterized by comprising the following steps:
Step 1: perform data preprocessing: project the three-dimensional model to obtain a plurality of corresponding views, and obtain the edge view set of the model using an edge detection algorithm.
Step 2: design a deep convolutional neural network and optimize the network model with an interactive attention module. One part of the view set is selected as the training set and the other part as the test set.
Step 3: training includes two processes, forward propagation and back propagation. The training data serve as the input of the interactive attention convolutional neural network model, which is trained to obtain the optimized interactive attention convolutional neural network model.
Step 4: extract semantic features of the freehand sketch and the model views using the optimized interactive attention convolutional neural network model and the Gist feature, and extract the two-dimensional shape distribution features of the freehand sketch and the model views using the two-dimensional shape distribution algorithm.
Step 5: fuse the features in a weighted manner, and retrieve the model most similar to the hand-drawn sketch according to the Euclidean distance.
2. The method for retrieving a three-dimensional model based on an interactive attention convolutional neural network as claimed in claim 1, wherein in step 1, the three-dimensional model is projected to obtain a plurality of corresponding views and an edge detection algorithm is used to obtain the edge view set of the model; the specific steps are as follows:
Step 1-1: place the three-dimensional model at the center of a virtual sphere;
Step 1-2: place a virtual camera above the model and rotate the model through 360 degrees in 30-degree steps to obtain the 12-view set of the three-dimensional model;
Step 1-3: obtain the edge view of each of the 12 original views using the Canny edge detection algorithm.
After projection, the three-dimensional model is characterized as a group of two-dimensional views, and the Canny edge detection algorithm reduces the semantic gap between the hand-drawn sketch and the three-dimensional model views.
3. The method for retrieving the three-dimensional model based on the interactive attention convolutional neural network as claimed in claim 1, wherein in step 2, the deep convolutional neural network is designed and the network model is optimized with the interactive attention module; the specific steps are as follows:
Step 2-1: determine the depth of the convolutional neural network, the size of the convolution kernels, and the number of convolutional and pooling layers;
Step 2-2: design the interactive attention module. A global pooling layer is connected after the output of the convolutional layer to obtain the information amount Z_k of each channel of the convolutional layer conv_n:

Z_k = (1 / (W_n × H_n)) Σ_{i=1}^{W_n} Σ_{j=1}^{H_n} conv_nk(i, j)

where conv_nk denotes the k-th feature map output by the n-th convolutional layer, of size W_n × H_n.
Step 2-3: connect two fully connected layers after the global pooling layer and adaptively adjust the attention weight S_kn of each channel according to the information amount:

S_kn = F_ex(Z, W) = σ(g(Z, W)) = σ(W_2 δ(W_1 Z))

where δ is the ReLU function, σ is the sigmoid function, and W_1 and W_2 are the weights of the first and second fully connected layers, respectively.
Step 2-4: compute the interactive attention weights S_k1 and S_k2 of the two neighbouring convolutional layers respectively and fuse them to obtain the optimal attention weight S_k:

S_k = Average(S_k1, S_k2)

Step 2-5: fuse the attention weight S_k with the second convolutional layer conv_2 and the first pooling layer a_p to obtain the final result a_2:

a_2 = (S_k ⊗ conv_2) ⊕ a_p

where ⊗ denotes channel-wise weighting and ⊕ element-wise fusion.
One part of the view set is selected as the training set and the other part as the test set.
4. The method for retrieving the three-dimensional model based on the interactive attention convolutional neural network as claimed in claim 1, wherein in step 3, the convolutional neural network model is trained; the specific steps are as follows:
Step 3-1: input the training data into the initialized interactive attention convolutional neural network model;
Step 3-2: extract view features through the convolutional layers; the shallow convolutional layers extract low-level features and the deeper convolutional layers extract high-level semantic features;
Step 3-3: the attention module, fused with the neighbouring convolutional layers through channel weighting, reduces the information lost when the edge views of the hand-drawn sketch or the model are pooled;
Step 3-4: reduce the scale of the view features through the pooling layers, thereby reducing the number of parameters and speeding up model computation;
Step 3-5: pass through a Dropout layer to reduce the overfitting caused by insufficient training samples;
Step 3-6: after alternating convolution, attention, Dropout and pooling operations, the features finally enter a fully connected layer, which reduces their dimensionality and connects them into a one-dimensional high-level semantic feature vector;
Step 3-7: use the labeled 2D views to optimize the weights and biases of the interactive attention convolutional neural network during back propagation. The 2D view set is {v_1, v_2, …, v_n} with label set {l_1, l_2, …, l_n}. The 2D views belong to t classes, numbered 1, 2, …, t. After forward propagation, the prediction probability of v_i in class j is y_test_ij. The label l_i of v_i is compared with class j to compute the expected probability y_ij:

y_ij = 1 if l_i = j, otherwise y_ij = 0

Step 3-8: compare the predicted probability y_test_ij with the true probability y_ij and compute the error loss using the cross-entropy loss function:

loss = −Σ_{i=1}^{n} Σ_{j=1}^{t} y_ij log(y_test_ij)

The interactive attention convolutional neural network model is iterated continuously to obtain the optimized model, and the weights and biases are stored.
5. The method for retrieving the three-dimensional model based on the interactive attention convolutional neural network as claimed in claim 1, wherein in step 4, semantic features of the freehand sketch and the model views are extracted with the optimized interactive attention convolutional neural network model and the Gist feature, and their two-dimensional shape distribution features are extracted with the two-dimensional shape distribution algorithm; the specific process is as follows:
Step 4-1: input the test data into the optimized interactive attention convolutional neural network model;
Step 4-2: extract the features of the fully connected layer as the high-level semantic features of the hand-drawn sketch or model view.
Step 4-3: divide the sketch or 2D view of size m × n into 4 × 4 blocks; each block has size a × b, where a = m/4 and b = n/4.
Step 4-4: process each block with 32 Gabor filters of 4 scales and 8 directions, and combine the processed features to obtain the Gist feature:

G(x, y) = cat(I(x, y) * g_ij(x, y)), i = 1, …, 4, j = 1, …, 8

where G(x, y) is the Gist feature over the 32 Gabor filters and cat() denotes the concatenation operation. Here x and y are pixel positions and I(x, y) denotes a block; g_ij(x, y) is the filter at the i-th scale and j-th direction, and * denotes the convolution operation.
Step 4-5: randomly and equidistantly sample points on the boundary of the sketch or 2D view, collected as {(x_1, y_1), …, (x_i, y_i), …, (x_n, y_n)}, where (x_i, y_i) are point coordinates.
Step 4-6: use the D1 descriptor to represent the distance between the centroid and a random sample point on the boundary of the sketch or two-dimensional view. Points are drawn from the samples and collected as PD1 = {ai_1, …, ai_k, …, ai_N}. The D1 shape distribution feature set is {D1_v_1, …, D1_v_i, …, D1_v_Bins}, where D1_v_i is the statistic of the interval (BinSize × (i−1), BinSize × i), Bins is the number of intervals and BinSize is the interval length:

D1_v_i = |{P | dist(P, O) ∈ (BinSize × (i−1), BinSize × i), P ∈ PD1}|

where BinSize = max({dist(P, O) | P ∈ PD1}) / N, dist() is the Euclidean distance between two points, and O is the centroid of the sketch or 2D view.
Step 4-7: use the D2 descriptor to describe the distance between two random sample points on the boundary of the sketch or two-dimensional view. Point pairs are drawn from the samples and collected as PD2 = {(ai_1, bi_1), (ai_2, bi_2), …, (ai_N, bi_N)}. The D2 shape distribution feature set is {D2_v_1, …, D2_v_i, …, D2_v_Bins}, where D2_v_i denotes the statistic in the interval (BinSize × (i−1), BinSize × i):

D2_v_i = |{P | dist(P) ∈ (BinSize × (i−1), BinSize × i), P ∈ PD2}|

where BinSize = max({dist(P) | P ∈ PD2}) / N.
Step 4-8: use the D3 descriptor to describe the square root of the area formed by 3 random sample points on the boundary of the sketch or 2D view. Point triplets are drawn from the samples and collected as PD3 = {(ai_1, bi_1, ci_1), (ai_2, bi_2, ci_2), …, (ai_n, bi_n, ci_n)}. The D3 shape distribution feature set is {D3_v_1, …, D3_v_i, …, D3_v_Bins}, where D3_v_i denotes the statistic in the interval (BinSize × (i−1), BinSize × i):

D3_v_i = |{P | herson(P) ∈ (BinSize × (i−1), BinSize × i), P ∈ PD3}|

where herson() denotes Heron's formula, applied to the triangle P = (P_1, P_2, P_3):

herson(P) = sqrt(Area(P)), Area(P) = sqrt(p(p − a)(p − b)(p − c)), p = (a + b + c) / 2

where a = dist(P_1, P_2), b = dist(P_1, P_3), c = dist(P_2, P_3).
Step 4-9: concatenate D1_v_i, D2_v_i and D3_v_i (i = 1, 2, …, Bins) to form the shape distribution feature.
6. The three-dimensional model retrieval method based on interactive attention CNN and weighted similarity calculation of claim 1, wherein in step 5, a plurality of features are fused and the model most similar to the hand-drawn sketch is retrieved according to the similarity measurement formula; the specific process is as follows:
Step 5-1: select the Euclidean distance as the similarity measure;
Step 5-2: extract feature vectors from the two-dimensional views and the sketch using the improved interactive attention convolutional neural network and normalize them; compute the similarity with the Euclidean distance, denoted distance1, and the retrieval accuracy, denoted t1;
Step 5-3: extract feature vectors of the sketch and the model views using the Gist feature and normalize them; compute the similarity with the Euclidean distance, denoted distance2, and the retrieval accuracy, denoted t2;
Step 5-4: extract feature vectors of the sketch and the model views using the two-dimensional shape distribution feature and normalize them; compute the similarity with the Euclidean distance, denoted distance3, and the retrieval accuracy, denoted t3;
Step 5-5: compare the accuracies of the three features and fuse the similarities with weights to form the new feature similarity Sim(distance):

Sim(distance) = w1*distance1 + w2*distance2 + w3*distance3, with w1 + w2 + w3 = 1

where w1 = t1/(t1+t2+t3), w2 = t2/(t1+t2+t3), w3 = t3/(t1+t2+t3).
Step 5-6: sort by similarity from small to large to produce the retrieval results.
CN202110270518.7A 2021-03-12 2021-03-12 Three-dimensional model retrieval method based on interactive attention convolution neural network Active CN113032613B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110270518.7A CN113032613B (en) 2021-03-12 2021-03-12 Three-dimensional model retrieval method based on interactive attention convolution neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110270518.7A CN113032613B (en) 2021-03-12 2021-03-12 Three-dimensional model retrieval method based on interactive attention convolution neural network

Publications (2)

Publication Number Publication Date
CN113032613A true CN113032613A (en) 2021-06-25
CN113032613B CN113032613B (en) 2022-11-08

Family

ID=76470237

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110270518.7A Active CN113032613B (en) 2021-03-12 2021-03-12 Three-dimensional model retrieval method based on interactive attention convolution neural network

Country Status (1)

Country Link
CN (1) CN113032613B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113658176A (en) * 2021-09-07 2021-11-16 重庆科技学院 Ceramic tile surface defect detection method based on interactive attention and convolutional neural network
CN114842287A (en) * 2022-03-25 2022-08-02 中国科学院自动化研究所 Monocular three-dimensional target detection model training method and device of depth-guided deformer

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101004748A (en) * 2006-10-27 2007-07-25 北京航空航天大学 Method for searching 3D model based on 2D sketch
CN101089846A (en) * 2006-06-16 2007-12-19 国际商业机器公司 Data analysis method, equipment and data analysis auxiliary method
CN101110826A (en) * 2007-08-22 2008-01-23 张建中 Method, device and system for constructing multi-dimensional address
CN101350016A (en) * 2007-07-20 2009-01-21 富士通株式会社 Device and method for searching three-dimensional model
CN103295025A (en) * 2013-05-03 2013-09-11 南京大学 Automatic selecting method of three-dimensional model optimal view
CN105243137A (en) * 2015-09-30 2016-01-13 华南理工大学 Draft-based three-dimensional model retrieval viewpoint selection method
CN107122396A (en) * 2017-03-13 2017-09-01 西北大学 Three-dimensional model searching algorithm based on depth convolutional neural networks
US20180039856A1 (en) * 2016-08-04 2018-02-08 Takayuki Hara Image analyzing apparatus, image analyzing method, and recording medium
CN109783887A (en) * 2018-12-25 2019-05-21 西安交通大学 A kind of intelligent recognition and search method towards Three-dimension process feature
CN110033023A (en) * 2019-03-11 2019-07-19 北京光年无限科技有限公司 It is a kind of based on the image processing method and system of drawing this identification
CN110569386A (en) * 2019-09-16 2019-12-13 哈尔滨理工大学 Three-dimensional model retrieval method based on hand-drawn sketch integrated descriptor
CN111078913A (en) * 2019-12-16 2020-04-28 天津运泰科技有限公司 Three-dimensional model retrieval method based on multi-view convolution neural network
CN111242207A (en) * 2020-01-08 2020-06-05 天津大学 Three-dimensional model classification and retrieval method based on visual saliency information sharing
CN111597367A (en) * 2020-05-18 2020-08-28 河北工业大学 Three-dimensional model retrieval method based on view and Hash algorithm
CN111625667A (en) * 2020-05-18 2020-09-04 北京工商大学 Three-dimensional model cross-domain retrieval method and system based on complex background image

Patent Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101089846A (en) * 2006-06-16 2007-12-19 国际商业机器公司 Data analysis method, equipment and data analysis auxiliary method
CN101004748A (en) * 2006-10-27 2007-07-25 北京航空航天大学 Method for searching 3D model based on 2D sketch
CN101350016A (en) * 2007-07-20 2009-01-21 富士通株式会社 Device and method for searching three-dimensional model
CN101110826A (en) * 2007-08-22 2008-01-23 张建中 Method, device and system for constructing multi-dimensional address
CN103295025A (en) * 2013-05-03 2013-09-11 南京大学 Automatic selecting method of three-dimensional model optimal view
CN105243137A (en) * 2015-09-30 2016-01-13 华南理工大学 Draft-based three-dimensional model retrieval viewpoint selection method
US20180039856A1 (en) * 2016-08-04 2018-02-08 Takayuki Hara Image analyzing apparatus, image analyzing method, and recording medium
CN107122396A (en) * 2017-03-13 2017-09-01 西北大学 Three-dimensional model searching algorithm based on depth convolutional neural networks
CN109783887A (en) * 2018-12-25 2019-05-21 西安交通大学 A kind of intelligent recognition and search method towards Three-dimension process feature
CN110033023A (en) * 2019-03-11 2019-07-19 北京光年无限科技有限公司 It is a kind of based on the image processing method and system of drawing this identification
CN110569386A (en) * 2019-09-16 2019-12-13 哈尔滨理工大学 Three-dimensional model retrieval method based on hand-drawn sketch integrated descriptor
CN111078913A (en) * 2019-12-16 2020-04-28 天津运泰科技有限公司 Three-dimensional model retrieval method based on multi-view convolution neural network
CN111242207A (en) * 2020-01-08 2020-06-05 天津大学 Three-dimensional model classification and retrieval method based on visual saliency information sharing
CN111597367A (en) * 2020-05-18 2020-08-28 河北工业大学 Three-dimensional model retrieval method based on view and Hash algorithm
CN111625667A (en) * 2020-05-18 2020-09-04 北京工商大学 Three-dimensional model cross-domain retrieval method and system based on complex background image

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
LI ZHI LIN: "Improving Variational Auto-Encoder with Self-Attention and Mutual Information for Image Generation", https://doi.org/10.1145/3376067.3376090 *
FANG Zhixiang et al.: "Research trends in pedestrian navigation from absolute space positioning to relative space perception", Journal of Wuhan University (Information Science Edition) *
WANG Xinying et al.: "Weight-optimized ensemble convolutional neural network and its application in three-dimensional model recognition", Journal of Graphics *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113658176A (en) * 2021-09-07 2021-11-16 重庆科技学院 Ceramic tile surface defect detection method based on interactive attention and convolutional neural network
CN113658176B (en) * 2021-09-07 2023-11-07 重庆科技学院 Ceramic tile surface defect detection method based on interaction attention and convolutional neural network
CN114842287A (en) * 2022-03-25 2022-08-02 中国科学院自动化研究所 Monocular three-dimensional target detection model training method and device of depth-guided deformer
CN114842287B (en) * 2022-03-25 2022-12-06 中国科学院自动化研究所 Monocular three-dimensional target detection model training method and device of depth-guided deformer

Also Published As

Publication number Publication date
CN113032613B (en) 2022-11-08

Similar Documents

Publication Publication Date Title
Wang et al. Enhancing sketch-based image retrieval by cnn semantic re-ranking
CN108564129B (en) Trajectory data classification method based on generation countermeasure network
CN108038122B (en) Trademark image retrieval method
CN110598029A (en) Fine-grained image classification method based on attention transfer mechanism
CN112633350B (en) Multi-scale point cloud classification implementation method based on graph convolution
CN110222218B (en) Image retrieval method based on multi-scale NetVLAD and depth hash
CN112446388A (en) Multi-category vegetable seedling identification method and system based on lightweight two-stage detection model
CN106682233A (en) Method for Hash image retrieval based on deep learning and local feature fusion
CN106909924A (en) A kind of remote sensing image method for quickly retrieving based on depth conspicuousness
CN110633708A (en) Deep network significance detection method based on global model and local optimization
CN104778242A (en) Hand-drawn sketch image retrieval method and system on basis of image dynamic partitioning
CN102385592B (en) Image concept detection method and device
CN112541532B (en) Target detection method based on dense connection structure
CN112115291B (en) Three-dimensional indoor model retrieval method based on deep learning
CN111125411A (en) Large-scale image retrieval method for deep strong correlation hash learning
CN110751027B (en) Pedestrian re-identification method based on deep multi-instance learning
CN113032613B (en) Three-dimensional model retrieval method based on interactive attention convolution neural network
WO2023019698A1 (en) Hyperspectral image classification method based on rich context network
CN105320764A (en) 3D model retrieval method and 3D model retrieval apparatus based on slow increment features
CN112733602B (en) Relation-guided pedestrian attribute identification method
CN110263855A (en) A method of it is projected using cobasis capsule and carries out image classification
CN115457332A (en) Image multi-label classification method based on graph convolution neural network and class activation mapping
CN113806580B (en) Cross-modal hash retrieval method based on hierarchical semantic structure
Zhang et al. Semisupervised center loss for remote sensing image scene classification
CN111125396A (en) Image retrieval method of single-model multi-branch structure

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant