CN112182275A - Trademark approximate retrieval system and method based on multi-dimensional feature fusion - Google Patents


Publication number
CN112182275A
Authority
CN
China
Prior art keywords
trademark
feature
module
neural network
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011046201.7A
Other languages
Chinese (zh)
Inventor
迟敬泽
尹乾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Digital China Information Systems Co ltd
Original Assignee
Digital China Information Systems Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Digital China Information Systems Co ltd
Priority to CN202011046201.7A
Publication of CN112182275A
Legal status: Pending

Classifications

    • G06F16/583 Information retrieval of still image data; retrieval characterised by metadata automatically derived from the content
    • G06F18/23213 Pattern recognition; non-hierarchical clustering using statistics or function optimisation, with a fixed number of clusters, e.g. K-means clustering
    • G06F18/253 Pattern recognition; fusion techniques applied to extracted features
    • G06N3/045 Neural networks; architectures combining several networks


Abstract

The invention provides a trademark approximate-retrieval method based on multi-dimensional feature fusion, which fuses features of three dimensions (convolutional neural network, visual bag-of-words, and graphic element) to realize trademark retrieval with both semantic information and discriminative power. First, multi-scale convolutional neural network features are extracted on the basis of a convolutional neural network and optimized by training with a triplet loss function. At the same time, visual bag-of-words features of the trademark image are extracted and combined with the trademark's graphic elements. Because features of different dimensions carry complementary information, fusing the multi-scale convolutional neural network features, the visual bag-of-words features, and the graphic-element features fully exploits and mines the complementary correlations among them, thereby improving the overall trademark retrieval effect.

Description

Trademark approximate retrieval system and method based on multi-dimensional feature fusion
Technical Field
The invention relates to the technical field of image search, in particular to a trademark approximate retrieval system and method based on multi-dimensional feature fusion.
Background
Trademarks are prominent marks that identify a good or service, or the specific individual or enterprise associated with it. A registered trademark is legitimate property that must be protected against trademark infringement. The trademark office undertakes trademark examination, registration, administrative adjudication, and related work; with the rapid development of the economy, the number of enterprises keeps growing and the volume of trademark registrations keeps increasing, which raises the difficulty of examining, verifying, and managing trademarks.
For each newly applied trademark, the trademark office audits the application to ensure that the new trademark does not imitate, and differs sufficiently from, already registered trademarks. At present the office checks trademarks mainly by searching manually annotated text information and graphic codes. This approach offers limited retrieval precision and efficiency, and the workload of manual annotation and auditing is heavy, so it faces serious challenges in handling the ever-increasing volume of trademark registration applications.
With the development of computer image-processing technology, retrieval methods based on the content of the trademark image itself have emerged. Such a method does not depend on manually annotated information: it extracts image features from the trademark image and matches trademarks by feature similarity to retrieve approximate trademarks. Most existing methods, however, follow the traditional image-retrieval approach of using a single, simple image feature. A trademark image is usually composed of abstract figures and symbols and is therefore highly abstract and complex, and a semantic gap exists between a computer's representation of the trademark image and human cognition. Existing methods thus struggle to understand trademark images, which limits the accuracy and efficiency of trademark retrieval.
Therefore, aiming at these problems in trademark retrieval, the invention provides a trademark approximate-retrieval method and system with multi-dimensional feature fusion, which fuses the three feature dimensions of convolutional neural network, visual bag-of-words, and graphic element, provides trademark approximate retrieval with semantic information and discriminative power, and achieves a better trademark retrieval effect.
Disclosure of Invention
The invention provides a trademark approximate-retrieval system and method based on multi-dimensional feature fusion, aiming to solve the problems of the prior art when single image features are used: because a trademark image is usually composed of abstract figures and symbols, it is highly abstract and complex, and a semantic gap exists between a computer's representation of the trademark image and human cognition, so existing methods struggle to understand trademark images, which in turn limits the accuracy and efficiency of trademark retrieval.
The invention provides a trademark approximate-retrieval system based on multi-dimensional feature fusion, which comprises a trademark database, a feature retrieval module, and a feature extraction module, wherein the trademark database is connected with the feature extraction module, and the feature extraction module is connected with the feature retrieval module;
The feature extraction module comprises a neural network feature module, a visual bag-of-words feature module, and a graphic element module. The neural network feature module is connected with the trademark database and the feature retrieval module; the visual bag-of-words feature module is connected with the trademark database and the feature retrieval module; the graphic element module is connected with the trademark database. The neural network feature module extracts multi-scale convolutional neural network features of images from the trademark database, optimizes the neural network by training with a triplet metric loss function, and outputs the multi-scale convolutional neural network features to the feature retrieval module. The visual bag-of-words feature module constructs a visual dictionary from key-point features extracted from images in the trademark database and outputs the image visual bag-of-words features extracted with this dictionary to the feature retrieval module. The graphic element module establishes an index library of registered-trademark graphic elements; the graphic-element features are entered manually by a query operator and passed to the feature retrieval module.
A neural network feature module: this module is responsible for extracting the multi-scale convolutional neural network features of an image and for optimizing the neural network by training with the triplet metric loss function;
A visual bag-of-words feature module: this module is responsible for extracting image key-point features, constructing a visual dictionary, and finally outputting the image visual bag-of-words features extracted with the visual dictionary;
A graphic element feature module: this module is responsible for establishing an index library of registered-trademark graphic elements with a full-text search engine; the graphic-element features of the trademark to be registered are entered manually by a query operator;
A feature retrieval module: in the retrieval stage, this module calls the feature extraction modules to extract the multi-scale convolutional neural network features, visual bag-of-words features, and graphic-element features of the trademark image to be registered, computes the similarity between the trademark to be registered and the trademarks in the registered trademark library, and returns the retrieval results ranked by similarity.
In the trademark approximate-retrieval system based on multi-dimensional feature fusion, the neural network feature module comprises a residual network, a first fully connected layer, a second fully connected layer, a first convolutional layer, a second convolutional layer, a first pooling layer, and a second pooling layer. The residual network, the first convolutional layer, and the second convolutional layer are all connected with the trademark database; the residual network is data-connected to the first fully connected layer; the first convolutional layer is connected to the first pooling layer; the second convolutional layer is data-connected to the second pooling layer; the first fully connected layer, the first pooling layer, and the second pooling layer are all connected to the second fully connected layer; and the second fully connected layer transmits the multi-scale convolutional neural network features to the feature retrieval module.
The invention provides a trademark approximate retrieval method based on multi-dimensional feature fusion, which comprises the following steps:
s1, establishing a trademark database;
s2, inputting a trademark to be registered through a characteristic retrieval module;
s3, according to a trademark to be registered, the neural network feature module extracts multi-scale convolution neural network features of the trademark image through the trademark database and transmits the features to the feature retrieval module, the visual bag feature module extracts visual bag features of the trademark image through the trademark database and transmits the visual bag features to the feature retrieval module, and the graphic element module extracts graphic element features of the trademark image through the trademark database and transmits the graphic element features to the feature retrieval module;
S4, based on the multi-scale convolutional neural network features, the visual bag-of-words features, and the graphic-element features, performing similarity matching between the trademark to be registered and the trademarks in the registered trademark library to obtain the fused retrieval result of the three feature dimensions.
In a preferred mode of the trademark approximate-retrieval method based on multi-dimensional feature fusion, similarity matching between the trademark to be registered and the trademarks in the registered trademark library in step S4 obtains the fused retrieval result of the three feature dimensions by the following formula:
Score(a, d_i) = α*Score_CNN(a, d_i) + β*Score_BoVW(a, d_i) + Score_element(a, d_i)

where Score(a, d_i) is the total similarity between the trademark a to be registered and the i-th trademark d_i in the registered trademark library, Score_CNN is the similarity of the multi-scale convolutional neural network features, Score_BoVW is the similarity of the visual bag-of-words features, Score_element is the similarity of the graphic-element features, and α and β are weight parameters.
In a preferred mode of the trademark approximate-retrieval method based on multi-dimensional feature fusion, the specific way in which the neural network feature module extracts the multi-scale convolutional neural network features of the trademark image in step S3 and transmits them to the feature retrieval module is as follows:
s311, inputting the trademark image into a residual error network, a first convolution layer and a second convolution layer respectively, initializing the residual error network by ImageNet pre-training parameters, and extracting an Average Pooling layer in the residual error network to obtain a first characteristic through a first full-connection layer;
s312, the first convolution layer and the second convolution layer adopt different Padding values and Stride values, and a second characteristic and a third characteristic are obtained through the first pooling layer and the second pooling layer respectively;
s313, the first feature, the second feature and the third feature all pass through L2Regularization;
s314, splicing and inputting the regularized first feature, second feature and third feature into a second full-connection layer;
and S315, obtaining the multi-scale convolutional neural network characteristics through linear mapping by the second full-connection layer.
In a preferred mode of the trademark approximate-retrieval method based on multi-dimensional feature fusion, the extraction of the multi-scale convolutional neural network features of the trademark image by the neural network feature module in step S3 further comprises the following step:
s316, optimizing the multi-scale convolutional neural network parameters through a triple measurement loss function, and mining and learning semantic information of the registered trademark image;
the loss function is expressed as:
Loss(p,q,r)
=max(0,margin+D(f(q),f(p))-D(f(q),D(f(n))))
wherein Loss (p, q, r) is a metric Loss function of the triple (p, q, r), p, q, r are respectively a current trademark, an approximate trademark of the current trademark and a non-approximate trademark of the current trademark, f (·) is an output feature of the multi-scale convolutional neural network, D (·) is a cosine distance between two input features, and margin is a boundary parameter.
In a preferred mode of the trademark approximate-retrieval method based on multi-dimensional feature fusion, the specific way in which the visual bag-of-words feature module extracts the visual bag-of-words features of the trademark image in step S3 and transmits them to the feature retrieval module is as follows:
s321, extracting key points of the image by using three local detectors of Harris, Hessian and Kaze;
s322, extracting key point characteristics of the image key points through a Sift descriptor;
s323, establishing a visual dictionary through Kmeans clustering of the key point characteristics;
and S324, extracting visual word bag characteristics from the image through a visual dictionary.
On one hand, the method extracts multi-scale convolutional neural network features on the basis of a convolutional neural network and models the semantic information of trademarks by optimizing these features with a triplet loss function. At the same time, the invention recognizes that the features of different dimensions contain a large amount of complementary information: by fusing the three feature dimensions of multi-scale convolutional neural network features, visual bag-of-words features, and graphic-element features, the complementary correlations among different feature dimensions are fully exploited and mined, so better retrieval accuracy can be obtained.
The invention has the following beneficial effects:
(1) In the multi-scale convolutional neural network, the residual network emphasizes learning high-level semantic features while the other two shallow branches emphasize learning the content features of the image; at the same time, optimization training with the triplet metric loss function lets the multi-scale convolutional neural network features better model the relative similarity of trademarks, yielding a better retrieval effect;
(2) The visual bag-of-words features emphasize the local content features of the image, and the graphic codes represent a person's high-level semantic understanding of the trademark; fusing the three feature dimensions of multi-scale convolutional neural network features, visual bag-of-words features, and graphic-code features lets the advantages of the different dimensions complement one another and improves the overall trademark retrieval effect.
Drawings
FIG. 1 is a schematic diagram of a trademark approximate retrieval system based on multi-dimensional feature fusion;
FIG. 2 is a schematic diagram of a feature extraction module of a trademark approximate retrieval system based on multi-dimensional feature fusion;
FIG. 3 is a schematic structural diagram of a neural network feature module of a trademark approximation retrieval system based on multi-dimensional feature fusion;
FIG. 4 is a flowchart of a trademark approximation retrieval method based on multi-dimensional feature fusion.
Reference numerals:
1. a trademark database; 2. a feature retrieval module; 3. a feature extraction module; 31. a neural network feature module; 311. a residual network; 312. a first fully connected layer; 313. a second fully connected layer; 314. a first convolutional layer; 315. a second convolutional layer; 316. a first pooling layer; 317. a second pooling layer; 32. a visual bag-of-words feature module; 33. a graphic element module.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly below with reference to the accompanying drawings.
Example 1
As shown in fig. 1, the system comprises a trademark database 1, a feature retrieval module 2, and a feature extraction module 3; the trademark database 1 is connected with the feature extraction module 3, and the feature extraction module 3 is connected with the feature retrieval module 2.
As shown in fig. 2, the feature extraction module 3 comprises a neural network feature module 31, a visual bag-of-words feature module 32, and a graphic element module 33. The neural network feature module 31 is connected with the trademark database 1 and the feature retrieval module 2; the visual bag-of-words feature module 32 is connected with the trademark database 1 and the feature retrieval module 2; the graphic element module 33 is connected with the trademark database 1. The neural network feature module 31 extracts multi-scale convolutional neural network features of images from the trademark database 1, optimizes the neural network by training with the triplet metric loss function, and outputs the multi-scale convolutional neural network features to the feature retrieval module 2. The visual bag-of-words feature module 32 constructs a visual dictionary from key-point features extracted from images in the trademark database 1 and outputs the image visual bag-of-words features extracted with this dictionary to the feature retrieval module 2. The graphic element module 33 creates an index library of registered-trademark graphic elements and outputs the graphic-element features to the feature retrieval module 2.
As shown in fig. 3, the neural network feature module 31 comprises a residual network 311, a first fully connected layer 312, a second fully connected layer 313, a first convolutional layer 314, a second convolutional layer 315, a first pooling layer 316, and a second pooling layer 317. The residual network 311, the first convolutional layer 314, and the second convolutional layer 315 are all connected to the trademark database 1; the residual network 311 is data-connected to the first fully connected layer 312; the first convolutional layer 314 is data-connected to the first pooling layer 316; the second convolutional layer 315 is data-connected to the second pooling layer 317; the first fully connected layer 312, the first pooling layer 316, and the second pooling layer 317 are all connected to the second fully connected layer 313; and the second fully connected layer 313 transmits the multi-scale convolutional neural network features to the feature retrieval module 2.
As shown in fig. 4, a trademark approximate retrieval method based on multi-dimensional feature fusion includes the following steps:
s1, establishing trademark database 1
Converting all registered trademark images and trademark images to be registered into gray images, eliminating the interference of colors on retrieval results, and uniformly storing all the images into a JPG format;
s2, inputting a trademark to be registered through the characteristic retrieval module 2;
s3, according to a trademark to be registered, the neural network feature module 31 extracts multi-scale convolution neural network features of the trademark image through the trademark database 1 and transmits the features to the feature retrieval module 2, the visual word bag feature module 32 extracts visual word bag features of the trademark image through the trademark database 1 and transmits the visual word bag features to the feature retrieval module 2, and the graphic element module 33 extracts graphic element features of the trademark image through the trademark database 1 and transmits the graphic element features to the feature retrieval module 2;
S4, based on the multi-scale convolutional neural network features, the visual bag-of-words features, and the graphic-element features, performing similarity matching between the trademark to be registered and the trademarks in the registered trademark library to obtain the fused retrieval result of the three feature dimensions.
The specific formula for the fused retrieval result of the three feature dimensions, obtained by similarity matching between the trademark to be registered and the trademarks in the registered trademark library, is:

Score(a, d_i) = α*Score_CNN(a, d_i) + β*Score_BoVW(a, d_i) + Score_element(a, d_i)

where Score(a, d_i) is the total similarity between the trademark a to be registered and the i-th trademark d_i in the registered trademark library, Score_CNN is the similarity of the multi-scale convolutional neural network features, Score_BoVW is the similarity of the visual bag-of-words features, Score_element is the similarity of the graphic-element features, and α and β are weight parameters; in this implementation α is 0.6 and β is 0.4. The trademarks in the registered trademark library are sorted by Score(a, d_i) from largest to smallest, and the sorted result is returned as the retrieval result for the trademark to be registered.
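As an illustrative sketch (not the patented implementation), the fusion and ranking step can be written as follows; the function names are hypothetical, and alpha = 0.6, beta = 0.4 follow the values given above:

```python
def fused_score(score_cnn, score_bovw, score_element, alpha=0.6, beta=0.4):
    # Score(a, d_i) = alpha*Score_CNN + beta*Score_BoVW + Score_element;
    # the graphic-element term enters unweighted, as in the formula above.
    return alpha * score_cnn + beta * score_bovw + score_element

def rank_trademarks(candidates):
    # candidates: list of (trademark_id, score_cnn, score_bovw, score_element).
    # Returns trademark ids sorted by fused similarity, largest first.
    ranked = sorted(candidates,
                    key=lambda c: fused_score(c[1], c[2], c[3]),
                    reverse=True)
    return [c[0] for c in ranked]
```

The per-dimension similarities are assumed to be precomputed; any monotone rescaling of them would change the appropriate alpha and beta.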
In the specific implementation, the accuracy of trademark retrieval is evaluated with the MAP (mean average precision) index, a common statistical measure of retrieval results, defined for a single query as:

AP = (1/R) * Σ_k (R_k / k) * rel_k

where R is the total number of positive samples for the query in the trademark library, R_k is the number of positive samples among the first k returned results, and rel_k indicates whether the k-th returned result is a positive sample (1 if it is, 0 if it is not). The formula above defines the AP value of a single query; the MAP index is the average of the APs over all queries.
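A minimal sketch of the AP/MAP computation defined above (helper names are hypothetical; `rels` is the 0/1 relevance list over the ranked results):

```python
def average_precision(rels, total_positives):
    # AP = (1/R) * sum over positive ranks k of (R_k / k).
    hits = 0
    ap = 0.0
    for k, rel in enumerate(rels, start=1):
        hits += rel          # running count R_k of positives in the top k
        if rel:
            ap += hits / k
    return ap / total_positives

def mean_average_precision(queries):
    # queries: list of (rels, total_positives); MAP is the mean of the APs.
    return sum(average_precision(r, n) for r, n in queries) / len(queries)
```

For example, a query whose relevant trademarks appear at ranks 1 and 3 out of 2 positives gets AP = (1/2)(1/1 + 2/3) = 5/6.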
In this embodiment, the images are first resized to a uniform 224 × 224 pixels. The first branch uses a ResNet followed by the first fully connected layer 312; in this implementation a ResNet101 network initialized with ImageNet dataset (www.image-net.org) pre-training parameters is used, and the 2048-dimensional features of its Average Pooling layer are extracted and input into the first fully connected layer 312. The first fully connected layer 312 uses a linear mapping with 4096 hidden neurons. The other two branches each use a convolutional layer followed by a pooling layer, with Padding values of 1 and 4 and Stride values of 16 and 32 respectively, each yielding 1536-dimensional features through its pooling layer. The features produced by the three branches are L2-normalized, concatenated, and input into the second fully connected layer 313, which also uses a linear mapping with 4096 hidden neurons, finally producing the 4096-dimensional multi-scale convolutional neural network features.
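The per-branch normalization and concatenation can be sketched with toy vectors standing in for the 2048- and 1536-dimensional branch outputs (the real branches are learned networks; this only illustrates steps S313 and S314, and the function names are hypothetical):

```python
import math

def l2_normalize(vec):
    # L2-normalize one branch's feature vector (step S313).
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def concat_branch_features(f_resnet, f_branch1, f_branch2):
    # Concatenate the three normalized branch features before they enter
    # the second fully connected layer (step S314).
    return l2_normalize(f_resnet) + l2_normalize(f_branch1) + l2_normalize(f_branch2)
```

In the embodiment the concatenated vector would be 2048 + 1536 + 1536 dimensional before the final 4096-dimensional linear mapping.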
The specific way in which the neural network feature module 31 extracts the multi-scale convolution neural network features of the trademark image and transmits the extracted multi-scale convolution neural network features to the feature retrieval module 2 in step S3 is as follows:
s311, inputting the trademark image into a residual error network 311, a first convolution layer 314 and a second convolution layer 315 respectively, and the residual error network 311 initialized by ImageNet pre-training parameters, and extracting a first feature of an Average Pooling layer in the residual error network 311 into a first full-connection layer 312;
s312, the first convolutional layer 314 and the second convolutional layer 315 obtain a second feature and a third feature through the first pooling layer 316 and the second pooling layer 317 respectively by using different Padding values and Stride values;
s313, the first feature, the second feature and the third feature all pass through L2Regularization;
s314, splicing and inputting the regularized first feature, second feature and third feature into a second full-connection layer 313;
and S315, obtaining the multi-scale convolutional neural network characteristics through linear mapping by the second fully-connected layer 313.
S316, optimizing the multi-scale convolutional neural network parameters through a triple measurement loss function, and mining and learning semantic information of the registered trademark image;
the loss function is expressed as:
Loss(p,q,r)
=max(0,margin+D(f(q),f(p))-D(f(q),D(f(n))))
wherein Loss (p, q, r) is a metric Loss function of the triple (p, q, r), p, q, r are respectively a current trademark, an approximate trademark of the current trademark and an approximate trademark of the current trademark, f (·) is an output feature of the multi-scale convolutional neural network, D (·) is a cosine distance between two input features, margin is a boundary parameter, and the value of margin in the implementation process is 0.1.
In the implementation, the triplets are constructed from the existing annotation information, and training samples are selected at random during training. Training is optimized with stochastic gradient descent, with an initial learning rate of 0.001, a momentum parameter of 0.9, and a batch size of 64.
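Under the stated choice of cosine distance and margin = 0.1, the triplet metric loss can be sketched as follows (function names are hypothetical; `f_q`, `f_p`, `f_r` stand for the network's output features for the triplet members):

```python
import math

def cosine_distance(u, v):
    # D(u, v) = 1 - cos(u, v); assumes non-zero input vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def triplet_loss(f_q, f_p, f_r, margin=0.1):
    # Pull the approximate pair together and push the non-approximate
    # trademark away by at least `margin`; zero loss once the gap is met.
    return max(0.0, margin + cosine_distance(f_q, f_p) - cosine_distance(f_q, f_r))
```

The loss is zero exactly when the non-approximate trademark is already at least `margin` farther from the anchor than the approximate one, which is what drives the features to model relative trademark similarity.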
The specific way in which the visual bag-of-words feature module 32 extracts the visual bag-of-words features of the trademark image in step S3 and transmits them to the feature retrieval module 2 is as follows:
s321, extracting key points of the image by using three local detectors of Harris, Hessian and Kaze;
s322, extracting key point characteristics of the image key points through a Sift descriptor;
s323, establishing a visual dictionary through Kmeans clustering of the key point characteristics;
and S324, extracting visual word bag characteristics from the image through a visual dictionary.
In this embodiment, the key points of the image are first extracted with the three local detectors Harris, Hessian, and KAZE.
Based on the extracted key points, key-point features are extracted with the SIFT descriptor; each key-point feature has 128 dimensions.
A visual dictionary of 2000 words is built from the SIFT key-point features with the K-means clustering algorithm.
Finally, for each image, the term frequency and inverse document frequency are computed over the visual dictionary, and a 2000-dimensional visual bag-of-words feature is output.
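A toy sketch of the quantization step: each descriptor is assigned to its nearest visual word and counted (the tiny dictionary here stands in for the 2000 K-means centers, and IDF weighting is omitted; function names are hypothetical):

```python
def nearest_word(desc, dictionary):
    # Index of the visual word (cluster center) closest to the descriptor.
    def sqdist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(range(len(dictionary)), key=lambda i: sqdist(desc, dictionary[i]))

def bovw_histogram(descriptors, dictionary):
    # Term-frequency histogram of visual words for one image; the real
    # system additionally applies inverse-document-frequency weighting.
    hist = [0] * len(dictionary)
    for d in descriptors:
        hist[nearest_word(d, dictionary)] += 1
    return hist
```

With the embodiment's 2000-word dictionary and 128-dimensional SIFT descriptors, the same two functions would yield the 2000-dimensional feature described above.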
In this embodiment, an index library of the graphic elements of registered trademarks is established, and the graphic element features of the trademark to be registered are manually input by the person conducting the query.
In this step, the index library for the graphic elements of registered trademarks is built with a full-text search engine library such as Lucene or Whoosh, to support the fast graphic element retrieval of the next step; the index fields of the index library include the trademark registration number and the set of graphic codes contained in the trademark image.
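As a rough illustration of such an index library, the sketch below is a hypothetical pure-Python stand-in for a Lucene or Whoosh index (the element codes and registration numbers are made up): an inverted index maps graphic-element codes to registration numbers, which supports the fast element lookup described.

```python
from collections import defaultdict

class ElementIndex:
    """Minimal inverted index: graphic-element code -> set of registration numbers."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, reg_no, element_codes):
        # Index one registered trademark by its set of graphic-element codes.
        for code in element_codes:
            self.postings[code].add(reg_no)

    def search(self, query_codes):
        # Return registration numbers sharing at least one element with the query.
        hits = set()
        for code in query_codes:
            hits |= self.postings.get(code, set())
        return hits

# Hypothetical Vienna-classification-style codes, for illustration only.
idx = ElementIndex()
idx.add("REG001", {"26.01.01", "27.05.01"})   # circle + stylised letters
idx.add("REG002", {"03.01.08"})               # animal device
print(idx.search({"26.01.01"}))
```

A production system would instead define these as indexed fields in a Lucene or Whoosh schema, gaining persistence and scoring for free.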
The above description concerns only a preferred embodiment of the present invention, but the scope of the present invention is not limited thereto; any equivalent substitution or modification of the technical solutions and inventive concepts herein that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of the present invention.

Claims (7)

1. A trademark approximate retrieval system based on multi-dimensional feature fusion, characterized by comprising a trademark database (1), a feature retrieval module (2) and a feature extraction module (3), wherein the trademark database (1) is connected with the feature extraction module (3), and the feature extraction module (3) is connected with the feature retrieval module (2);
the feature extraction module (3) comprises a neural network feature module (31), a visual bag-of-words feature module (32) and a graphic element module (33); the neural network feature module (31) is connected with the trademark database (1) and the feature retrieval module (2), the visual bag-of-words feature module (32) is connected with the trademark database (1) and the feature retrieval module (2), and the graphic element module (33) is connected with the trademark database (1); the neural network feature module (31) is used for extracting multi-scale convolutional neural network features of images through the trademark database (1), performing optimization training of the neural network based on a triple metric loss function, and outputting the multi-scale convolutional neural network features to the feature retrieval module (2); the visual bag-of-words feature module (32) is used for constructing a visual dictionary from image key point features extracted through the trademark database (1), extracting image visual bag-of-words features based on the visual dictionary, and inputting them into the feature retrieval module (2); and the graphic element module (33) is used for establishing an index library of graphic elements of registered trademarks, the graphic element features being manually input into the feature retrieval module (2) by a query operator.
2. The trademark approximate retrieval system based on multi-dimensional feature fusion as claimed in claim 1, wherein: the neural network feature module (31) comprises a residual network (311), a first fully-connected layer (312), a second fully-connected layer (313), a first convolutional layer (314), a second convolutional layer (315), a first pooling layer (316), and a second pooling layer (317), the residual network (311), the first convolutional layer (314) and the second convolutional layer (315) are all connected with the trademark database (1), the residual network (311) being data connected to the first fully connected layer (312), the first convolutional layer (314) is in data connection with the first pooling layer (316), the second convolutional layer (315) is in data connection with the second pooling layer (317), the first fully connected layer (312), the first pooling layer (316), and the second pooling layer (317) are all connected to the second fully connected layer (313), the second fully-connected layer (313) passes the multi-scale convolutional neural network features to the feature retrieval module (2).
3. A trademark approximate retrieval method based on multi-dimensional feature fusion, characterized by comprising the following steps:
s1, establishing the trademark database (1);
s2, inputting a trademark to be registered through the characteristic retrieval module (2);
s3, according to the trademark to be registered, the neural network feature module (31) extracts multi-scale convolution neural network features of the trademark image through the trademark database (1) and transmits the multi-scale convolution neural network features to the feature retrieval module (2), the visual bag-of-words feature module (32) extracts visual bag-of-words features of the trademark image through the trademark database (1) and transmits the visual bag-of-words features to the feature retrieval module (2), and the graphic element module (33) extracts graphic element features of the trademark image through the trademark database (1) and transmits the graphic element features to the feature retrieval module (2);
and S4, performing similarity matching on the trademark to be registered and trademarks in a registered trademark library based on the multi-scale convolutional neural network characteristics, the visual word bag characteristics and the graphic element characteristics to obtain a fusion retrieval result of the three dimensional characteristics.
4. The trademark approximate retrieval method based on multi-dimensional feature fusion as claimed in claim 3, wherein: the specific calculation formula of the fusion search result in step S4 is:
Score(a, di) = α·ScoreCNN(a, di) + β·ScoreBoVW(a, di) + Scoreelement(a, di)
wherein Score(a, di) is the total similarity between the trademark a to be registered and the i-th trademark di in the registered trademark library, ScoreCNN is the similarity of the multi-scale convolutional neural network features, ScoreBoVW is the similarity of the visual bag-of-words features, Scoreelement is the similarity of the graphic element features, and α and β are weight parameters.
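Assuming toy similarity scores and illustrative weights α = 0.5 and β = 0.3 (the claim does not fix these values), the fusion formula can be sketched as:

```python
def fused_score(score_cnn, score_bovw, score_element, alpha=0.5, beta=0.3):
    # Score(a, di) = alpha * ScoreCNN + beta * ScoreBoVW + Scoreelement;
    # the graphic-element score carries unit weight, as in the claim's formula.
    return alpha * score_cnn + beta * score_bovw + score_element

# Candidate trademarks ranked by fused similarity (toy scores in [0, 1]:
# CNN similarity, bag-of-words similarity, graphic-element similarity).
candidates = {"d1": (0.9, 0.8, 0.0), "d2": (0.4, 0.9, 1.0)}
ranked = sorted(candidates, key=lambda d: fused_score(*candidates[d]), reverse=True)
print(ranked)
```

Note how the unweighted element term lets a matching graphic element dominate the ranking here, which is consistent with manually coded elements being a strong approximation signal.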
5. The trademark approximate retrieval method based on multi-dimensional feature fusion as claimed in claim 3, wherein: in the step S3, the specific manner of extracting the multi-scale convolution neural network feature of the trademark image by the neural network feature module (31) and transmitting the extracted multi-scale convolution neural network feature to the feature retrieval module (2) is as follows:
s311, inputting the trademark image into the residual error network (311), the first convolution layer (314) and the second convolution layer (315) respectively, and extracting an average pooling layer in the residual error network (311) to pass through the first full-connection layer (312) to obtain a first characteristic;
s312, the first convolution layer (314) and the second convolution layer (315) adopt different convolution operation parameters, and a second feature and a third feature are obtained through the first pooling layer (316) and the second pooling layer (317) respectively;
s313, the first feature, the second feature and the third feature are each subjected to L2 regularization;
s314, splicing and inputting the first feature, the second feature and the third feature after regularization into the second full-connection layer (313);
s315, the second full-connection layer (313) obtains the multi-scale convolution neural network characteristics through linear mapping.
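Steps S313 to S315 can be sketched in NumPy as follows. The residual network and convolution branches are omitted, random vectors stand in for the three branch features, and the dimensions (4-D branches, 3-D output) are illustrative only.

```python
import numpy as np

def fuse_features(feat1, feat2, feat3, weight):
    # S313: L2-normalise each branch feature independently.
    normed = [f / np.linalg.norm(f) for f in (feat1, feat2, feat3)]
    # S314: concatenate (splice) the regularised features.
    concat = np.concatenate(normed)
    # S315: final fully-connected layer as a linear mapping.
    return weight @ concat

# Illustrative dimensions: three 4-D branch features mapped to a 3-D output.
rng = np.random.default_rng(0)
f1, f2, f3 = rng.normal(size=4), rng.normal(size=4), rng.normal(size=4)
W = rng.normal(size=(3, 12))
out = fuse_features(f1, f2, f3, W)
print(out.shape)
```

The per-branch normalization matters: it stops any one scale from dominating the concatenated vector before the final linear mapping.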
6. The trademark approximate retrieval method based on multi-dimensional feature fusion as claimed in claim 5, wherein: the step S3 in which the neural network feature module (31) extracts the neural network features of the trademark image and transmits the neural network features to the feature retrieval module (2) further includes:
s316, optimizing the multi-scale convolutional neural network parameters through a triple measurement loss function, and mining and learning semantic information of the registered trademark image;
the loss function is expressed as:
Loss(p, q, r) = max(0, margin + D(f(q), f(p)) - D(f(q), f(r)))
wherein Loss (p, q, r) is a metric Loss function of the triple (p, q, r), p, q, r are respectively a current trademark, an approximate trademark of the current trademark and a non-approximate trademark of the current trademark, f (·) is an output feature of the multi-scale convolutional neural network, D (·) is a cosine distance between two input features, and margin is a boundary parameter.
7. The trademark approximate retrieval method based on multi-dimensional feature fusion as claimed in claim 3, wherein: in the step S3, the specific manner in which the visual bag-of-words feature module (32) extracts the visual bag-of-words features of the trademark image and transmits the visual bag-of-words features to the feature retrieval module (2) is as follows:
S321, extracting key points of the image using three local detectors: Harris, Hessian and KAZE;
S322, extracting key point features at the image key points through a SIFT descriptor;
S323, establishing a visual dictionary through K-means clustering of the key point features;
S324, extracting visual bag-of-words features from the image through the visual dictionary.
CN202011046201.7A 2020-09-29 2020-09-29 Trademark approximate retrieval system and method based on multi-dimensional feature fusion Pending CN112182275A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011046201.7A CN112182275A (en) 2020-09-29 2020-09-29 Trademark approximate retrieval system and method based on multi-dimensional feature fusion

Publications (1)

Publication Number Publication Date
CN112182275A true CN112182275A (en) 2021-01-05

Family

ID=73945705

Country Status (1)

Country Link
CN (1) CN112182275A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112884005A (en) * 2021-01-21 2021-06-01 汉唐信通(北京)科技有限公司 Image retrieval method and device based on SPTAG and convolutional neural network
CN113326926A (en) * 2021-06-30 2021-08-31 上海理工大学 Fully-connected Hash neural network for remote sensing image retrieval
CN114003790A (en) * 2021-12-30 2022-02-01 北京企名片科技有限公司 Data processing method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106126581A (en) * 2016-06-20 2016-11-16 复旦大学 Cartographical sketching image search method based on degree of depth study
CN106919920A (en) * 2017-03-06 2017-07-04 重庆邮电大学 Scene recognition method based on convolution feature and spatial vision bag of words
CN109273054A (en) * 2018-08-31 2019-01-25 南京农业大学 Protein Subcellular interval prediction method based on relation map
CN110347853A (en) * 2019-07-09 2019-10-18 成都澳海川科技有限公司 A kind of image hash code generation method based on Recognition with Recurrent Neural Network
CN110852327A (en) * 2019-11-07 2020-02-28 首都师范大学 Image processing method, image processing device, electronic equipment and storage medium


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU Dong: "Medical image classification method based on a spatial co-occurrence bag-of-words model and convolutional neural network", Journal of Xiangnan University, 25 April 2020 (2020-04-25), pages 1-3 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination