CN114547357B - Furniture model retrieval method based on voice and sketch - Google Patents

Furniture model retrieval method based on voice and sketch

Info

Publication number
CN114547357B
CN114547357B (application CN202210043225.XA)
Authority
CN
China
Prior art keywords
sketch
model
dimensional
voice
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210043225.XA
Other languages
Chinese (zh)
Other versions
CN114547357A (en)
Inventor
武仲科
王醒策
赵禹铭
徐墅
王虎镇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Normal University
Original Assignee
Beijing Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Normal University
Priority to CN202210043225.XA
Publication of CN114547357A
Application granted
Publication of CN114547357B
Legal status: Active (current)
Anticipated expiration


Classifications

    • G06F: Electric digital data processing
    • G06F16/00: Information retrieval; database structures therefor; file system structures therefor
    • G06F16/50: Information retrieval of still image data
    • G06F16/53: Querying
    • G06F16/532: Query formulation, e.g. graphical querying
    • G06F16/58: Retrieval characterised by using metadata
    • G06F16/583: Retrieval using metadata automatically derived from the content
    • G06F16/5866: Retrieval using manually generated metadata, e.g. tags, keywords, comments, manually generated location and time information
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/205: Parsing
    • G06F40/216: Parsing using statistical methods
    • G06F40/279: Recognition of textual entities
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G06N: Computing arrangements based on specific computational models
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G10L: Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
    • G10L15/00: Speech recognition
    • G10L15/26: Speech-to-text systems
    • Y02P: Climate change mitigation technologies in the production or processing of goods
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30: Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Library & Information Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention belongs to the technical field of furniture model retrieval in computer-aided design, and particularly relates to a furniture model retrieval method based on voice and sketch, which comprises the following steps: S1, constructing a database based on a neural network, standard keyword descriptions and model description semantic features; S2, inputting a two-dimensional sketch into the database to obtain a sketch retrieval result; S3, inputting voice into the database to obtain a voice retrieval result; S4, obtaining the final retrieval result by voting on the sketch retrieval result and the voice retrieval result. The quick and accurate furniture model retrieval method disclosed by the invention saves a great deal of design cost and time, thereby improving the utilization rate and use value of furniture models.

Description

Furniture model retrieval method based on voice and sketch
Technical Field
The invention belongs to the technical field of furniture model retrieval in computer aided design, and particularly relates to a furniture model retrieval method based on voice and sketch.
Background
With the development of computer-aided design, furniture design increasingly relies on it. In the design of home scenes, the retrieval of furniture models is a critical issue. Because furniture models are numerous and widely varied, home designers need quick and accurate methods to retrieve them. The furniture retrieval task mainly comprises two aspects: extracting model features, and establishing the relationship between the input and the model features.
Two kinds of retrieval methods are generally used: text-based retrieval and sketch-based retrieval. Text-based retrieval requires a large number of manual labels on the models; the model features are stored as manual labels and compared with the input text to obtain the retrieval result. In sketch-based retrieval, the user draws a sketch of the model to be retrieved, the system extracts the sketch features, and the similarity between the sketch features and the model features is computed to obtain the retrieval result. Sketch-based retrieval does not require extensive manual labeling, but its accuracy is lower than that of traditional text-based retrieval.
In text-based retrieval, because the input is ambiguous, a complete match with the model label keywords is typically not possible, so the text features need to be expressed in a semantic way. Word vector embedding is such a semantic expression technique, and there are two types of word vector embedding methods: fixed representations and dynamic representations. Fixed representation methods include Word2Vec, GloVe, fastText, etc.; dynamic representation methods include ELMo, BERT, etc. Sketch-based retrieval methods fall into two general categories: those that extract three-dimensional model features from projections, and those that directly extract global features of the three-dimensional model.
For example, Chinese patent application CN201810597066.1 discloses a sketch-based three-dimensional model retrieval method. The three-dimensional model is processed into screenshots from multiple viewpoints, which are then processed with different contour-extraction operators into sketches of different styles to obtain a sketch dataset; after labels are attached, the dataset is used for training and testing. A hierarchical network is added on the basis of a convolutional neural network: one coarse classification network is trained to classify the input sketch into one of 40 major categories, and 40 fine classification networks each learn the specific differences among the models within a category, so that the input sketch is mapped to a specific three-dimensional model in a certain major category. The three-dimensional model can thus be retrieved with high accuracy and little information redundancy, and an angle matrix computed from sampled contour points is used as the input of the convolutional neural network, which makes the sketch features more consistent and the retrieval accuracy high. However, no furniture model retrieval method based on both voice and sketch has been proposed.
Disclosure of Invention
In order to solve the above problems, the invention provides a furniture model retrieval method based on voice and sketch, comprising the following steps:
S1, constructing a database based on a neural network, standard keyword description and model description semantic features;
S2, inputting a two-dimensional sketch into a database to obtain a sketch retrieval result;
s3, inputting voice into a database to obtain a voice retrieval result;
s4, obtaining a final search result through a voting mode based on the sketch search result and the voice search result.
Further, constructing the database based on the neural network, the standard keyword descriptions and the model description semantic features in S1 includes:
S101, constructing a pseudo-twin neural network for calculating the similarity between two-dimensional sketch features and three-dimensional model features, taking the three-dimensional model features and the two-dimensional sketch features as its input values, and extracting the features of the three-dimensional models in the database with LD-SIFT descriptors to obtain the three-dimensional model features used as inputs to the pseudo-twin neural network;
S102, constructing standard keyword descriptions for standardizing keyword descriptions of input search texts, acquiring texts corresponding to all three-dimensional models in advance, inputting the texts into a word2Vec pre-training model for word vector embedding, and obtaining vector representations of all model description keywords as all standard keyword descriptions;
S103, constructing all model description semantic features corresponding to all standard keyword descriptions, acquiring standard keyword descriptions corresponding to each three-dimensional model, and calculating IDF weights corresponding to each three-dimensional model by adopting IDF through the standard keyword descriptions to serve as all model description semantic features;
S104, constructing a database by constructing a pseudo twin neural network, standard keyword description and model description semantic features.
Further, the step S2 of inputting the two-dimensional sketch into the database to obtain a sketch retrieval result includes:
s201, drawing a two-dimensional sketch of a model to be searched by a user, inputting the two-dimensional sketch into a trained VGG-19 model, and acquiring a 4096-dimensional vector as an input sketch characteristic through an fc-7 layer of the trained VGG-19 model;
s202, inputting the input sketch features and the three-dimensional model features into a pseudo-twin neural network, and obtaining the similarity of the input sketch features and the three-dimensional model features as sketch retrieval results.
Further, in S202, the similarity between the input sketch features and the three-dimensional model features is calculated according to the following formula (1):
Sim(X_1, X_2) = L(W, Y, X_1, X_2)   (1)
where Sim(X_1, X_2) denotes the similarity between the input sketch features and the three-dimensional model features, L denotes the contrastive loss function, X_1 denotes sample feature 1 (the input sketch features), X_2 denotes sample feature 2 (the three-dimensional model features), W denotes the neural network parameters, and Y indicates whether the two samples match: when Y is 1 the samples match, and when Y is 0 they do not.
Further, in S202, the contrastive loss function is calculated according to the following formula (2):
L = (1 / 2N) Σ_{n=1}^{N} [ Y · D_W² + (1 - Y) · max(m - D_W, 0)² ]   (2)
where N is the number of samples, D_W is the Euclidean distance between sample X_1 and sample X_2, m is the preset upper distance threshold, and max(m - D_W, 0) takes the larger of m - D_W and 0.
Further, in S202, the Euclidean distance is calculated according to the following formula (3):
D_W(X_1, X_2) = ‖X_1 - X_2‖   (3)
where ‖·‖ denotes the modulus (length) of a vector, X_1 denotes sample 1, and X_2 denotes sample 2.
Further, in S3, the step of inputting the voice into a pre-constructed database to obtain a voice search result includes:
S301, converting the input voice into text data with a speech recognition algorithm, performing word segmentation and stop-word removal on the text data to obtain a keyword description sequence, and inputting the keyword description sequence into a word2Vec pre-training model for word vector embedding to obtain keyword vectors;
S302, carrying out similarity calculation on the keyword vector and standard keyword description corresponding to the database, converting the keyword description sequence into a standard keyword sequence through an optimal similarity principle, and carrying out IDF feature extraction on the standard keyword sequence to obtain IDF weight corresponding to the standard keyword sequence;
S303, determining the corresponding three-dimensional models based on the standard keyword descriptions, and selecting the model description semantic features of each such three-dimensional model as the IDF weights of that three-dimensional model in the database;
S304, computing, for each three-dimensional model, the inner product of the IDF weights of the standard keyword sequence and the IDF weights of that three-dimensional model in the database, and taking this inner product as the similarity between the input voice and the three-dimensional model in the pre-built database.
Further, in S303, the IDF weight is calculated according to the following formula (4):
W_tag = log(N_model / N_tag)   (4)
where W_tag denotes the weight of a tag (keyword), N_model denotes the total number of three-dimensional models, and N_tag denotes the number of three-dimensional models labelled with the tag in the standard keyword sequence.
Further, obtaining the final retrieval result from the sketch retrieval result and the voice retrieval result by voting in S4 includes:
S401, obtaining the similarities of all three-dimensional models for the input two-dimensional sketch and for the input voice respectively, sorting all three-dimensional models in descending order of similarity for each of the two results, and assigning weights accordingly;
S402, voting on the two retrieval results according to the weights to obtain a total weight for each model, and sorting the models from high to low total weight to obtain the final retrieval result.
Compared with the prior art, the invention has the following beneficial effects:
1. The furniture model retrieval method based on voice and sketch overcomes the shortcomings of any single retrieval method by combining the advantages of sketch retrieval and voice retrieval; it achieves high retrieval accuracy even when text keyword labelling is insufficient and meets higher retrieval-quality requirements.
2. The furniture model retrieval method based on voice and sketch is quick and accurate, and can save a great deal of design cost and time, thereby improving the utilization rate and use value of furniture models.
3. The furniture model retrieval method based on voice and sketch effectively extracts the voice information and the sketch information, and fuses them to compare the similarity of heterogeneous data. In sketch retrieval, the sketch features and the model features are extracted with VGG19 and LD-SIFT respectively, the features are registered in a pseudo-twin neural network that maps both into the same vector space, and retrieval is performed by computing their similarity. In voice retrieval, a Word2vec model is trained on a furniture-related corpus to establish a semantic relation library of furniture keywords; the voice is converted into text, the furniture description keywords in the text are extracted, keyword semantic relations are constructed, the keywords are converted into standard keywords, each keyword is weighted by its inverse document frequency, and each furniture model description and each query are converted into vectors so that retrieval is realized by vector similarity calculation. Finally, the two independent retrieval results are merged by voting, realizing a multi-modal joint retrieval mode.
Drawings
FIG. 1 is a block flow diagram of a furniture model retrieval method based on voice and sketch according to the invention;
FIG. 2 is a schematic flow chart of obtaining the final retrieval result by voting on the sketch retrieval result and the voice retrieval result in the furniture model retrieval method based on voice and sketch according to the invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and specific embodiments.
Building on the existing voice-based model retrieval method and sketch-based model retrieval method, the invention provides a joint retrieval method that takes the results of both retrieval modes into account at the same time, so that the retrieval result is more accurate and the retrieval more efficient. As shown in FIG. 1 and FIG. 2, the furniture model retrieval method based on voice and sketch includes:
s1, constructing a database based on a neural network, standard keyword description and model description semantic features:
S101, constructing a pseudo-twin neural network for calculating the similarity between two-dimensional sketch features and three-dimensional model features, taking the three-dimensional model features and the two-dimensional sketch features as its input values, and extracting the features of the three-dimensional models in the database with LD-SIFT descriptors to obtain the three-dimensional model features used as inputs to the pseudo-twin neural network;
S102, constructing standard keyword descriptions for standardizing keyword descriptions of input search texts, acquiring texts corresponding to all three-dimensional models in advance, inputting the texts into a word2Vec pre-training model for word vector embedding, and obtaining vector representations of all model description keywords as all standard keyword descriptions;
S103, constructing all model description semantic features corresponding to all standard keyword descriptions, acquiring standard keyword descriptions corresponding to each three-dimensional model, and calculating IDF weights corresponding to each three-dimensional model by adopting IDF through the standard keyword descriptions to serve as all model description semantic features;
S104, constructing a database by constructing a pseudo twin neural network, standard keyword description and model description semantic features.
The construction of the database mainly comprises 3 parts, namely, the construction of a neural network for calculating the similarity between a two-dimensional sketch and a three-dimensional model, the construction of standard keyword description and the construction of model description semantic features:
1) Constructing the neural network for calculating the similarity between the two-dimensional sketch and the three-dimensional model. To build a network that computes this similarity, the features of both inputs are extracted first: the features of the two-dimensional sketch are extracted with VGG-19, taking the fc-7 layer of a pre-trained VGG-19 model as the input two-dimensional sketch features, and the features of the three-dimensional model are extracted with LD-SIFT descriptors. A pseudo-twin neural network is then constructed for feature similarity calculation and trained on the three-dimensional model features and the two-dimensional sketch features, yielding a pseudo-twin neural network that can calculate the similarity between a sketch and a model; its loss function is the contrastive loss, which serves as the similarity measure;
2) Constructing the standard keyword descriptions, namely performing word-vector embedding on the original model feature descriptions in the database with a word2Vec pre-training model to obtain vector representations of all model description keywords, which are used to normalize the keyword descriptions of the input retrieval text;
3) Constructing the model description semantic features, namely building a semantic feature description for each model from the keyword descriptions of all models in the database using IDF (inverse document frequency): the inverse document frequency of each keyword is calculated, and the keyword frequencies corresponding to each model are assembled into a vector, which forms the semantic feature of that model. A short illustrative sketch of steps 2) and 3) is given after this list.
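The text-side construction in items 2) and 3) can be illustrated with the following minimal Python sketch. It is an illustration under stated assumptions only: the corpus, the per-model keyword lists and all variable names are hypothetical placeholders, gensim's Word2Vec stands in for the word2Vec pre-training model, and the IDF weight follows W_tag = log(N_model / N_tag) as in formula (4).

```python
import math
from gensim.models import Word2Vec

# Hypothetical furniture-related corpus: one tokenized description per entry.
corpus = [
    ["wooden", "chair", "four", "legs", "armrest"],
    ["round", "dining", "table", "wooden"],
    ["soft", "fabric", "sofa", "three", "seats"],
]

# Hypothetical keyword descriptions of the three-dimensional models.
model_keywords = {"chair": ["wooden", "chair", "armrest"],
                  "table": ["round", "table", "wooden"],
                  "sofa":  ["fabric", "sofa", "soft"]}

# 2) Standard keyword descriptions: a word vector for every model description keyword.
w2v = Word2Vec(sentences=corpus, vector_size=100, window=5, min_count=1)
standard_keywords = {kw: w2v.wv[kw]
                     for kws in model_keywords.values() for kw in kws}

# 3) Model description semantic features: IDF weight W_tag = log(N_model / N_tag)
#    for each keyword, collected into one weight vector per model.
n_model = len(model_keywords)
n_tag = {kw: sum(kw in kws for kws in model_keywords.values())
         for kw in standard_keywords}
idf = {kw: math.log(n_model / n) for kw, n in n_tag.items()}
semantic_features = {m_id: {kw: idf[kw] for kw in kws}
                     for m_id, kws in model_keywords.items()}
```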
s2, inputting a two-dimensional sketch into a database to obtain a sketch retrieval result:
s201, drawing a two-dimensional sketch of a model to be searched by a user, inputting the two-dimensional sketch into a trained VGG-19 model, and acquiring a 4096-dimensional vector as an input sketch characteristic through an fc-7 layer of the trained VGG-19 model;
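As an illustration of S201, the following sketch extracts the 4096-dimensional fc-7 activation with a pre-trained VGG-19 from torchvision. It is a minimal sketch assuming the patent's trained VGG-19 behaves like the standard ImageNet-pretrained network; the input size and normalization constants are the usual ImageNet defaults, not values specified in the patent.

```python
import torch
from torchvision import models, transforms
from PIL import Image

# Pre-trained VGG-19; fc-7 is the second 4096-dimensional fully connected layer,
# i.e. the output of classifier[0..4] (Linear, ReLU, Dropout, Linear, ReLU).
vgg19 = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1).eval()
fc7 = torch.nn.Sequential(vgg19.features, vgg19.avgpool, torch.nn.Flatten(),
                          *list(vgg19.classifier.children())[:5])

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def sketch_feature(path: str) -> torch.Tensor:
    """Return the 4096-dimensional fc-7 feature of a sketch image."""
    img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        return fc7(img).squeeze(0)   # shape: (4096,)
```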
S202, inputting the input sketch features and the three-dimensional model features into a pseudo-twin neural network, and obtaining the similarity of the input sketch features and the three-dimensional model features as sketch retrieval results:
The similarity between the input sketch features and the three-dimensional model features is calculated according to the following formula (1):
Sim(X_1, X_2) = L(W, Y, X_1, X_2)   (1)
where Sim(X_1, X_2) denotes the similarity between the input sketch features and the three-dimensional model features, L denotes the contrastive loss function, X_1 denotes sample feature 1 (the input sketch features), X_2 denotes sample feature 2 (the three-dimensional model features), W denotes the neural network parameters, and Y indicates whether the two samples match: when Y is 1 the samples match, and when Y is 0 they do not;
the contrastive loss function is calculated according to the following formula (2):
L = (1 / 2N) Σ_{n=1}^{N} [ Y · D_W² + (1 - Y) · max(m - D_W, 0)² ]   (2)
where N is the number of samples, D_W is the Euclidean distance between sample X_1 and sample X_2, m is the preset upper distance threshold, and max(m - D_W, 0) takes the larger of m - D_W and 0;
the Euclidean distance is calculated according to the following formula (3):
D_W(X_1, X_2) = ‖X_1 - X_2‖   (3)
where ‖·‖ denotes the modulus (length) of a vector, X_1 denotes sample 1, and X_2 denotes sample 2;
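The pseudo-twin network and the loss of formulas (1) to (3) can be sketched in PyTorch as follows. This is an illustrative sketch only: the branch architectures, the dimensionality assumed for the LD-SIFT model feature, and the margin m are assumptions, since the patent specifies only that the two non-weight-sharing branches take the 4096-dimensional sketch feature and the three-dimensional model feature as inputs and are trained with the contrastive loss.

```python
import torch
import torch.nn as nn

class PseudoSiamese(nn.Module):
    """Two branches with separate weights map sketch and model features into a
    common embedding space; D_W is the Euclidean distance between the embeddings."""
    def __init__(self, sketch_dim=4096, model_dim=1024, embed_dim=256):
        super().__init__()
        self.sketch_branch = nn.Sequential(nn.Linear(sketch_dim, 512), nn.ReLU(),
                                           nn.Linear(512, embed_dim))
        self.model_branch = nn.Sequential(nn.Linear(model_dim, 512), nn.ReLU(),
                                          nn.Linear(512, embed_dim))

    def forward(self, sketch_feat, model_feat):
        e1 = self.sketch_branch(sketch_feat)
        e2 = self.model_branch(model_feat)
        return torch.norm(e1 - e2, p=2, dim=1)   # D_W as in formula (3)

def contrastive_loss(d_w, y, m=1.0):
    """Formula (2): Y = 1 for matching pairs, Y = 0 for non-matching pairs."""
    return torch.mean(0.5 * (y * d_w.pow(2) +
                             (1 - y) * torch.clamp(m - d_w, min=0).pow(2)))
```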
s3, inputting voice into a database to obtain a voice retrieval result:
S301, converting the input voice into text data with a speech recognition algorithm, performing word segmentation and stop-word removal on the text data to obtain a keyword description sequence, and inputting the keyword description sequence into a word2Vec pre-training model for word vector embedding to obtain keyword vectors;
S302, carrying out similarity calculation on the keyword vector and standard keyword description corresponding to the database, converting the keyword description sequence into a standard keyword sequence through an optimal similarity principle, and carrying out IDF feature extraction on the standard keyword sequence to obtain IDF weight corresponding to the standard keyword sequence;
S303, determining the corresponding three-dimensional models based on the standard keyword descriptions, and selecting the model description semantic features of each such three-dimensional model as the IDF weights of that three-dimensional model in the database;
S304, computing, for each three-dimensional model, the inner product of the IDF weights of the standard keyword sequence and the IDF weights of that three-dimensional model in the database, and taking this inner product as the similarity between the input voice and the three-dimensional model in the pre-built database:
The IDF weight is calculated according to the following formula (4):
W_tag = log(N_model / N_tag)   (4)
where W_tag denotes the weight of a tag (keyword), N_model denotes the total number of three-dimensional models, and N_tag denotes the number of three-dimensional models labelled with the tag in the standard keyword sequence;
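To make S301 to S304 concrete, the sketch below continues the hypothetical database example given earlier: recognized query keywords are mapped to their closest standard keywords by cosine similarity of word vectors, and the query is scored against each model by the inner product of IDF weights. The speech-recognition and word-segmentation steps are omitted, and the w2v, standard_keywords, idf and semantic_features objects are the placeholder ones built in that earlier sketch.

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def to_standard_keywords(query_words, w2v, standard_keywords):
    """S302: map each query keyword to its most similar standard keyword."""
    std = []
    for w in query_words:
        if w not in w2v.wv:
            continue
        best = max(standard_keywords,
                   key=lambda s: cosine(w2v.wv[w], standard_keywords[s]))
        std.append(best)
    return std

def voice_similarity(std_query, idf, semantic_features):
    """S304: inner product of the query IDF vector with each model's IDF vector."""
    query_vec = {kw: idf[kw] for kw in std_query if kw in idf}
    return {m_id: sum(w * feats.get(kw, 0.0) for kw, w in query_vec.items())
            for m_id, feats in semantic_features.items()}

# Example: recognized text "a soft sofa made of fabric" yields the keywords below.
std_q = to_standard_keywords(["soft", "sofa", "fabric"], w2v, standard_keywords)
print(voice_similarity(std_q, idf, semantic_features))   # highest score for "sofa"
```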
s4, obtaining a final search result through a voting mode based on the sketch search result and the voice search result:
S401, obtaining the similarities of all three-dimensional models for the input two-dimensional sketch and for the input voice respectively, sorting all three-dimensional models in descending order of similarity for each of the two results, and assigning weights accordingly;
S402, voting on the two retrieval results according to the weights to obtain a total weight for each model, and sorting the models from high to low total weight to obtain the final retrieval result.
In step S4, the two retrieval modes are combined by voting: different voting weights are allocated to the models according to their positions in each ranked retrieval result, with weights assigned from the first result to the last; the two sets of results are then voted according to these weights to obtain a total weight for each model, and finally the models are ranked from high to low total weight to give the final retrieval result.
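A minimal sketch of this voting step is given below. The patent does not fix the exact weighting scheme, so a simple linearly decreasing rank weight is assumed here purely for illustration.

```python
def rank_weights(ranked_ids):
    """Assign descending weights to a ranked list of model ids (assumed scheme)."""
    n = len(ranked_ids)
    return {m_id: n - i for i, m_id in enumerate(ranked_ids)}

def vote(sketch_ranking, voice_ranking):
    """Fuse the two ranked lists by summing their rank weights (S401-S402)."""
    total = {}
    for weights in (rank_weights(sketch_ranking), rank_weights(voice_ranking)):
        for m_id, w in weights.items():
            total[m_id] = total.get(m_id, 0) + w
    return sorted(total, key=total.get, reverse=True)

# Example: sketch retrieval ranks A > B > C, voice retrieval ranks B > A > C.
print(vote(["A", "B", "C"], ["B", "A", "C"]))   # ['A', 'B', 'C'] (A and B tie at 5, C scores 2)
```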
The above description covers only specific embodiments of the present invention, but the scope of the present invention is not limited thereto; any changes or substitutions that can readily be conceived by those skilled in the art within the technical scope disclosed herein shall fall within the scope of the present invention, which is defined by the appended claims.

Claims (8)

1. A furniture model retrieval method based on voice and sketch is characterized by comprising the following steps:
S1, constructing a database based on a neural network, standard keyword description and model description semantic features;
S1, constructing a database based on a neural network, standard keyword descriptions and model description semantic features, wherein constructing the database comprises the following steps:
S101, constructing a pseudo-twin neural network for calculating the similarity between two-dimensional sketch features and three-dimensional model features, taking the three-dimensional model features and the two-dimensional sketch features as its input values, and extracting the features of the three-dimensional models in the database with LD-SIFT descriptors to obtain the three-dimensional model features used as inputs to the pseudo-twin neural network;
S102, constructing standard keyword descriptions for standardizing keyword descriptions of input search texts, acquiring texts corresponding to all three-dimensional models in advance, inputting the texts into a word2Vec pre-training model for word vector embedding, and obtaining vector representations of all model description keywords as all standard keyword descriptions;
S103, constructing all model description semantic features corresponding to all standard keyword descriptions, acquiring standard keyword descriptions corresponding to each three-dimensional model, and calculating IDF weights corresponding to each three-dimensional model by adopting IDF through the standard keyword descriptions to serve as all model description semantic features;
S104, constructing a database by constructing a pseudo twin neural network, standard keyword description and model description semantic features;
S2, inputting a two-dimensional sketch into a database to obtain a sketch retrieval result;
s3, inputting voice into a database to obtain a voice retrieval result;
s4, obtaining a final search result through a voting mode based on the sketch search result and the voice search result.
2. The method for retrieving a furniture model based on voice and sketch according to claim 1, wherein S2 of inputting the two-dimensional sketch into a database to obtain a sketch retrieval result comprises:
s201, drawing a two-dimensional sketch of a model to be searched by a user, inputting the two-dimensional sketch into a trained VGG-19 model, and acquiring a 4096-dimensional vector as an input sketch characteristic through an fc-7 layer of the trained VGG-19 model;
s202, inputting the input sketch features and the three-dimensional model features into a pseudo-twin neural network, and obtaining the similarity of the input sketch features and the three-dimensional model features as sketch retrieval results.
3. The furniture model retrieval method based on voice and sketch according to claim 2, wherein the similarity between the input sketch features and the three-dimensional model features in S202 is calculated according to the following formula (1):
Sim(X_1, X_2) = L(W, Y, X_1, X_2)   (1)
where Sim(X_1, X_2) denotes the similarity between the input sketch features and the three-dimensional model features, L denotes the contrastive loss function, X_1 denotes sample feature 1, X_2 denotes sample feature 2, W denotes the neural network parameters, and Y indicates whether the two samples match: when Y is 1 the samples match, and when Y is 0 they do not.
4. The furniture model retrieval method based on voice and sketch according to claim 3, wherein the contrastive loss function is calculated according to the following formula (2):
L = (1 / 2N) Σ_{n=1}^{N} [ Y · D_W² + (1 - Y) · max(m - D_W, 0)² ]   (2)
where N is the number of samples, D_W is the Euclidean distance between sample X_1 and sample X_2, m is the preset upper distance threshold, and max(m - D_W, 0) takes the larger of m - D_W and 0.
5. The furniture model retrieval method based on voice and sketch according to claim 4, wherein the Euclidean distance is calculated according to the following formula (3):
D_W(X_1, X_2) = ‖X_1 - X_2‖   (3)
where ‖·‖ denotes the modulus (length) of a vector, X_1 denotes sample 1, and X_2 denotes sample 2.
6. The furniture model retrieval method based on voice and sketch according to claim 1, wherein S3, obtaining a voice retrieval result by inputting voice into a pre-built database includes:
S301, converting the input voice into text data with a speech recognition algorithm, performing word segmentation and stop-word removal on the text data to obtain a keyword description sequence, and inputting the keyword description sequence into a word2Vec pre-training model for word vector embedding to obtain keyword vectors;
S302, carrying out similarity calculation on the keyword vector and standard keyword description corresponding to the database, converting the keyword description sequence into a standard keyword sequence through an optimal similarity principle, and carrying out IDF feature extraction on the standard keyword sequence to obtain IDF weight corresponding to the standard keyword sequence;
S303, determining the corresponding three-dimensional models based on the standard keyword descriptions, and selecting the model description semantic features of each such three-dimensional model as the IDF weights of that three-dimensional model in the database;
S304, computing, for each three-dimensional model, the inner product of the IDF weights of the standard keyword sequence and the IDF weights of that three-dimensional model in the database, and taking this inner product as the similarity between the input voice and the three-dimensional model in the pre-built database.
7. The furniture model retrieval method based on voice and sketch according to claim 6, wherein the IDF weight in S304 is calculated according to the following formula (4):
W_tag = log(N_model / N_tag)   (4)
where W_tag denotes the weight of a tag (keyword), N_model denotes the total number of three-dimensional models, and N_tag denotes the number of three-dimensional models labelled with the tag in the standard keyword sequence.
8. The furniture model retrieval method based on voice and sketch according to claim 1, wherein obtaining the final retrieval result based on the sketch retrieval result and the voice retrieval result through voting in S4 comprises:
S401, obtaining the similarities of all three-dimensional models for the input two-dimensional sketch and for the input voice respectively, sorting all three-dimensional models in descending order of similarity for each of the two results, and assigning weights accordingly;
S402, voting on the two retrieval results according to the weights to obtain a total weight for each model, and sorting the models from high to low total weight to obtain the final retrieval result.
CN202210043225.XA 2022-01-14 2022-01-14 Furniture model retrieval method based on voice and sketch Active CN114547357B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210043225.XA CN114547357B (en) 2022-01-14 2022-01-14 Furniture model retrieval method based on voice and sketch

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210043225.XA CN114547357B (en) 2022-01-14 2022-01-14 Furniture model retrieval method based on voice and sketch

Publications (2)

Publication Number Publication Date
CN114547357A CN114547357A (en) 2022-05-27
CN114547357B true CN114547357B (en) 2024-08-16

Family

ID=81672138

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210043225.XA Active CN114547357B (en) 2022-01-14 2022-01-14 Furniture model retrieval method based on voice and sketch

Country Status (1)

Country Link
CN (1) CN114547357B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101004748A (en) * 2006-10-27 2007-07-25 北京航空航天大学 Method for searching 3D model based on 2D sketch
CN102955848A (en) * 2012-10-29 2013-03-06 北京工商大学 Semantic-based three-dimensional model retrieval system and method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20130059040A (en) * 2011-11-28 2013-06-05 연세대학교 산학협력단 Article retrival system and method using sketch in store

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101004748A (en) * 2006-10-27 2007-07-25 北京航空航天大学 Method for searching 3D model based on 2D sketch
CN102955848A (en) * 2012-10-29 2013-03-06 北京工商大学 Semantic-based three-dimensional model retrieval system and method

Also Published As

Publication number Publication date
CN114547357A (en) 2022-05-27

Similar Documents

Publication Publication Date Title
CN109829104B (en) Semantic similarity based pseudo-correlation feedback model information retrieval method and system
CN112905822B (en) Deep supervision cross-modal counterwork learning method based on attention mechanism
CN104376406B (en) A kind of enterprise innovation resource management and analysis method based on big data
CN117290489B (en) Method and system for quickly constructing industry question-answer knowledge base
CN102207945B (en) Knowledge network-based text indexing system and method
CN110188197B (en) Active learning method and device for labeling platform
CN104199965A (en) Semantic information retrieval method
CN111061939B (en) Scientific research academic news keyword matching recommendation method based on deep learning
CN112559684A (en) Keyword extraction and information retrieval method
CN114139533A (en) Text content auditing method for Chinese novel field
CN112307182A (en) Question-answering system-based pseudo-correlation feedback extended query method
CN110990597A (en) Cross-modal data retrieval system based on text semantic mapping and retrieval method thereof
CN115203421A (en) Method, device and equipment for generating label of long text and storage medium
CN118069812B (en) Navigation method based on large model
CN111104437A (en) Test data unified retrieval method and system based on object model
CN118245564B (en) Method and device for constructing feature comparison library supporting semantic review and repayment
CN116450883A (en) Video moment retrieval method based on video content fine granularity information
CN117993393A (en) Method, device and system for checking online labeling policy terms based on word and sentence vectors
CN114493783A (en) Commodity matching method based on double retrieval mechanism
CN117454898A (en) Method and device for realizing legal entity standardized output according to input text
CN105117735A (en) Image detection method in big data environment
Zhang et al. mSHINE: A multiple-meta-paths simultaneous learning framework for heterogeneous information network embedding
CN114547357B (en) Furniture model retrieval method based on voice and sketch
CN114298020B (en) Keyword vectorization method based on topic semantic information and application thereof
CN110688559A (en) Retrieval method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant