CN112182275A - Trademark approximate retrieval system and method based on multi-dimensional feature fusion

Trademark approximate retrieval system and method based on multi-dimensional feature fusion

Info

Publication number
CN112182275A
Authority
CN
China
Prior art keywords
feature
trademark
module
neural network
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011046201.7A
Other languages
Chinese (zh)
Other versions
CN112182275B (en)
Inventor
迟敬泽
尹乾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Digital China Information Systems Co ltd
Original Assignee
Digital China Information Systems Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Digital China Information Systems Co ltd filed Critical Digital China Information Systems Co ltd
Priority to CN202011046201.7A priority Critical patent/CN112182275B/en
Publication of CN112182275A publication Critical patent/CN112182275A/en
Application granted granted Critical
Publication of CN112182275B publication Critical patent/CN112182275B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Library & Information Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a trademark approximate retrieval method based on multi-dimensional feature fusion, which fuses features from three dimensions (convolutional neural network, visual bag of words, and graphic elements) to realize trademark approximate retrieval with richer semantic information and discriminative power. First, multi-scale convolutional neural network features are extracted and then optimized through training with a triplet loss function. At the same time, visual bag-of-words features of the trademark image are extracted and combined with the trademark's graphic elements, taking into account the complementary information among features of the different dimensions. By fully exploiting the correlation and complementarity among the multi-scale convolutional neural network features, the visual bag-of-words features and the graphic element features, the comprehensive trademark retrieval effect is improved.


Description

Trademark approximate retrieval system and method based on multi-dimensional feature fusion
Technical Field
The invention relates to the technical field of image search, in particular to a trademark approximate retrieval system and method based on multi-dimensional feature fusion.
Background
Trademarks are prominent marks that identify a good, a service, or the specific individual or business associated with it. A registered trademark is legitimate property that must be protected against trademark infringement. The trademark office undertakes trademark examination, registration, administrative adjudication and related work; with rapid economic development, the number of enterprises keeps growing and trademark registrations keep increasing, which makes examining, verifying and managing trademarks ever more difficult for the trademark office.
For each newly applied trademark, the trademark office audits it to ensure that it does not imitate, and differs sufficiently from, registered trademarks. At present the trademark office checks trademarks mainly by searching manually annotated text information and graphic codes. This approach is limited in retrieval precision and efficiency, and the workload of manual annotation and auditing is heavy, so it faces serious challenges in handling the ever-increasing volume of trademark registration applications.
With the development of computer image processing technology, retrieval methods based on the content of the trademark image itself have emerged. Such a method does not depend on manually annotated information: it extracts image features from the trademark image and performs similarity matching on those features to retrieve approximate trademarks. Most existing methods adopt a traditional image-retrieval approach based on a single, simple image feature. However, trademark images are usually composed of abstract figures and symbols and are therefore highly abstract and complex, and a semantic gap exists between a computer's representation of a trademark image and human cognition. Existing methods consequently struggle to understand trademark images, which limits the accuracy and efficiency of trademark retrieval.
Therefore, aiming at these problems in trademark retrieval, the invention provides a trademark approximate retrieval method and system with multi-dimensional feature fusion, which fuses features from three dimensions (convolutional neural network, visual bag of words, and graphic elements) to provide trademark approximate retrieval with semantic information and discriminative power, achieving a better trademark retrieval effect.
Disclosure of Invention
The invention provides a trademark approximate retrieval system and method based on multi-dimensional feature fusion, aiming to solve the problems of the prior art that uses a single image feature: a trademark image is usually composed of abstract figures and symbols and is highly abstract and complex, and a semantic gap exists between a computer's representation of the trademark image and human cognition, so existing methods struggle to understand trademark images, which in turn limits the accuracy and efficiency of trademark retrieval.
The invention provides a trademark approximate retrieval system based on multi-dimensional feature fusion, which comprises a trademark database, a feature retrieval module and a feature extraction module, wherein the trademark database is connected with the feature extraction module and the feature extraction module is connected with the feature retrieval module;
the feature extraction module comprises a neural network feature module, a visual bag-of-words feature module and a graphic element module. The neural network feature module is connected with the trademark database and the feature retrieval module, the visual bag-of-words feature module is connected with the trademark database and the feature retrieval module, and the graphic element module is connected with the trademark database. The neural network feature module is used for extracting multi-scale convolutional neural network features of the images in the trademark database, optimizing the neural network through training with a triplet metric loss function, and outputting the multi-scale convolutional neural network features to the feature retrieval module. The visual bag-of-words feature module is used for constructing a visual dictionary from the image keypoint features extracted from the trademark database and outputting the image visual bag-of-words features extracted with that dictionary to the feature retrieval module. The graphic element module is used for establishing an index library of registered-trademark graphic elements; the graphic element features are entered manually by query staff and passed to the feature retrieval module.
Neural network feature module: this module extracts the multi-scale convolutional neural network features of the image and optimizes the network through training with the triplet metric loss function;
Visual bag-of-words feature module: this module extracts image keypoint features, constructs a visual dictionary, and outputs the image's visual bag-of-words features extracted with that dictionary;
Graphic element feature module: this module builds an index library of registered-trademark graphic elements with a full-text search engine; the graphic element features of the trademark to be registered are entered manually by query staff;
Feature retrieval module: in the retrieval stage, this module calls the feature modules to extract the multi-scale convolutional neural network features, the visual bag-of-words features and the graphic element features of the trademark image to be registered, computes the similarity between the trademark to be registered and the trademarks in the registered trademark library, and returns the retrieval results sorted by similarity.
In the trademark approximate retrieval system based on multi-dimensional feature fusion, the neural network feature module comprises a residual network, a first fully-connected layer, a second fully-connected layer, a first convolutional layer, a second convolutional layer, a first pooling layer and a second pooling layer. The residual network, the first convolutional layer and the second convolutional layer are all connected with the trademark database; the residual network feeds the first fully-connected layer; the first convolutional layer feeds the first pooling layer; the second convolutional layer feeds the second pooling layer; the first fully-connected layer, the first pooling layer and the second pooling layer all feed the second fully-connected layer; and the second fully-connected layer passes the multi-scale convolutional neural network features to the feature retrieval module.
The invention provides a trademark approximate retrieval method based on multi-dimensional feature fusion, which comprises the following steps of:
S1, establishing a trademark database;
S2, inputting a trademark to be registered through the feature retrieval module;
S3, according to the trademark to be registered, the neural network feature module extracts the multi-scale convolutional neural network features of the trademark image through the trademark database and transmits them to the feature retrieval module; the visual bag-of-words feature module extracts the visual bag-of-words features of the trademark image through the trademark database and transmits them to the feature retrieval module; and the graphic element module extracts the graphic element features of the trademark image through the trademark database and transmits them to the feature retrieval module;
S4, based on the multi-scale convolutional neural network features, the visual bag-of-words features and the graphic element features, similarity matching is performed between the trademark to be registered and the trademarks in the registered trademark library to obtain the fusion retrieval result of the three dimensions of features.
In a preferred mode of the trademark approximate retrieval method based on multi-dimensional feature fusion, similarity matching between the trademark to be registered and the trademarks in the registered trademark library in step S4 obtains the fusion retrieval result of the three dimensions of features with the following formula:

Score(a, d_i) = α*Score_CNN(a, d_i) + β*Score_BoVW(a, d_i) + Score_element(a, d_i)

where Score(a, d_i) is the total similarity between the trademark a to be registered and the i-th trademark d_i in the registered trademark library, Score_CNN is the similarity of the multi-scale convolutional neural network features, Score_BoVW is the similarity of the visual bag-of-words features, Score_element is the similarity of the graphic element features, and α and β are weight parameters.
In a preferred mode, the specific way in which the neural network feature module extracts the multi-scale convolutional neural network features of the trademark image in step S3 and transmits them to the feature retrieval module is as follows:
S311, the trademark image is input into the residual network, the first convolutional layer and the second convolutional layer respectively; the residual network is initialized with ImageNet pre-training parameters, and the output of the Average Pooling layer in the residual network is passed through the first fully-connected layer to obtain the first feature;
S312, the first convolutional layer and the second convolutional layer use different Padding and Stride values, and yield the second feature and the third feature through the first pooling layer and the second pooling layer respectively;
S313, the first feature, the second feature and the third feature are each L2-regularized;
S314, the regularized first, second and third features are concatenated and input into the second fully-connected layer;
S315, the second fully-connected layer produces the multi-scale convolutional neural network features through a linear mapping.
In a preferred mode, extracting the multi-scale convolutional neural network features of the trademark image in step S3 and transmitting them to the feature retrieval module further comprises:
S316, the multi-scale convolutional neural network parameters are optimized through the triplet metric loss function, mining and learning the semantic information of registered trademark images;
the loss function is expressed as:

Loss(p, q, r) = max(0, margin + D(f(q), f(p)) - D(f(q), f(r)))

where Loss(p, q, r) is the metric loss function of the triplet (p, q, r); p, q and r are the current trademark, an approximate trademark of the current trademark and a non-approximate trademark of the current trademark, respectively; f(·) is the output feature of the multi-scale convolutional neural network; D(·,·) is the cosine distance between two input features; and margin is a boundary parameter.
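For illustration, the following is a minimal sketch (not part of the patent) of this triplet metric loss in PyTorch, assuming batched feature tensors of shape (batch, dim); the cosine distance D is computed as 1 minus the cosine similarity, and the default margin follows the value 0.1 used later in the embodiment.

```python
import torch
import torch.nn.functional as F

def triplet_cosine_loss(f_p, f_q, f_r, margin=0.1):
    """Loss(p, q, r) = max(0, margin + D(f(q), f(p)) - D(f(q), f(r))),
    with D the cosine distance; f_p, f_q, f_r are the multi-scale CNN
    features of the current, approximate and non-approximate trademarks."""
    d_qp = 1.0 - F.cosine_similarity(f_q, f_p, dim=1)  # distance to the approximate trademark
    d_qr = 1.0 - F.cosine_similarity(f_q, f_r, dim=1)  # distance to the non-approximate trademark
    return torch.clamp(margin + d_qp - d_qr, min=0.0).mean()
```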
In a preferred mode, the specific way in which the visual bag-of-words feature module extracts the visual bag-of-words features of the trademark image in step S3 and transmits them to the feature retrieval module is as follows:
S321, keypoints of the image are extracted with three local detectors: Harris, Hessian and KAZE;
S322, keypoint features are extracted at the image keypoints with the SIFT descriptor;
S323, a visual dictionary is built by k-means clustering of the keypoint features;
S324, visual bag-of-words features are extracted from the image with the visual dictionary.
On one hand, the method extracts multi-scale convolutional neural network features and, by optimizing them through training with the triplet loss function, further models the semantic information of trademarks. Meanwhile, the invention takes into account that the features of the different dimensions carry a large amount of complementary information: based on the three dimensions of features, namely the multi-scale convolutional neural network features, the visual bag-of-words features and the graphic element features, it fully exploits the correlation and complementarity among features of different dimensions, and thereby achieves better retrieval accuracy.
The invention has the following beneficial effects:
(1) in the multi-scale convolutional neural network, the residual network emphasizes learning high-level semantic features while the other two shallow paths emphasize learning the content features of the image; at the same time, optimization training with the triplet metric loss function lets the multi-scale convolutional neural network features better model the relative similarity of trademarks, yielding a better retrieval effect;
(2) the visual bag-of-words features emphasize the local characteristics of the image content, while the graphic codes represent a person's high-level semantic understanding of the trademark; fusing the three dimensions of features, namely the multi-scale convolutional neural network features, the visual bag-of-words features and the graphic coding features, lets the features of different dimensions complement one another and improves the comprehensive trademark retrieval effect.
Drawings
FIG. 1 is a schematic diagram of a trademark approximate retrieval system based on multi-dimensional feature fusion;
FIG. 2 is a schematic diagram of a feature extraction module of a trademark approximate retrieval system based on multi-dimensional feature fusion;
FIG. 3 is a schematic structural diagram of a neural network feature module of a trademark approximation retrieval system based on multi-dimensional feature fusion;
FIG. 4 is a flowchart of a trademark approximation retrieval method based on multi-dimensional feature fusion.
Reference numerals:
1. a trademark database; 2. a feature retrieval module; 3. a feature extraction module; 31. a neural network feature module; 311. a residual network; 312. a first fully-connected layer; 313. a second fully-connected layer; 314. a first convolutional layer; 315. a second convolutional layer; 316. a first pooling layer; 317. a second pooling layer; 32. a visual bag-of-words feature module; 33. a graphic element module.
Detailed Description
The technical solutions in the embodiments of the present invention are described below with reference to the accompanying drawings.
Example 1
As shown in fig. 1, the system comprises a trademark database 1, a feature retrieval module 2 and a feature extraction module 3; the trademark database 1 is connected with the feature extraction module 3, and the feature extraction module 3 is connected with the feature retrieval module 2.
As shown in fig. 2, the feature extraction module 3 includes a neural network feature module 31, a visual bag-of-words feature module 32 and a graphic element module 33. The neural network feature module 31 is connected with the trademark database 1 and the feature retrieval module 2, the visual bag-of-words feature module 32 is connected with the trademark database 1 and the feature retrieval module 2, and the graphic element module 33 is connected with the trademark database 1. The neural network feature module 31 extracts multi-scale convolutional neural network features of the images in the trademark database 1, optimizes the neural network through training with the triplet metric loss function, and outputs the multi-scale convolutional neural network features to the feature retrieval module 2. The visual bag-of-words feature module 32 constructs a visual dictionary from the image keypoint features extracted from the trademark database 1 and outputs the image visual bag-of-words features extracted with that dictionary to the feature retrieval module 2. The graphic element module 33 creates an index library of registered-trademark graphic elements and outputs the graphic element features to the feature retrieval module 2.
As shown in fig. 3, the neural network feature module 31 includes a residual network 311, a first fully-connected layer 312, a second fully-connected layer 313, a first convolutional layer 314, a second convolutional layer 315, a first pooling layer 316 and a second pooling layer 317. The residual network 311, the first convolutional layer 314 and the second convolutional layer 315 are all connected to the trademark database 1; the residual network 311 is data-connected to the first fully-connected layer 312; the first convolutional layer 314 is data-connected to the first pooling layer 316; the second convolutional layer 315 is data-connected to the second pooling layer 317; the first fully-connected layer 312, the first pooling layer 316 and the second pooling layer 317 are all connected to the second fully-connected layer 313; and the second fully-connected layer 313 passes the multi-scale convolutional neural network features to the feature retrieval module 2.
As shown in fig. 4, a trademark approximate retrieval method based on multi-dimensional feature fusion includes the following steps:
S1, establishing the trademark database 1:
All registered trademark images and the trademark image to be registered are converted into grayscale images to eliminate the interference of color on the retrieval results, and all images are stored uniformly in JPG format (a minimal sketch of this conversion is given below);
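A minimal sketch of this preprocessing step in Python with Pillow; the function and path names are illustrative, not from the patent.

```python
from pathlib import Path
from PIL import Image

def preprocess_trademark(src_path, dst_dir):
    """Convert a trademark image to grayscale (removing color as a retrieval
    factor, per step S1) and store it uniformly in JPG format."""
    img = Image.open(src_path).convert("L")              # "L" = 8-bit grayscale
    out = Path(dst_dir) / (Path(src_path).stem + ".jpg")
    img.save(out, format="JPEG")                         # JPEG supports grayscale
    return out
```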
S2, inputting a trademark to be registered through the feature retrieval module 2;
S3, according to the trademark to be registered, the neural network feature module 31 extracts the multi-scale convolutional neural network features of the trademark image through the trademark database 1 and transmits them to the feature retrieval module 2; the visual bag-of-words feature module 32 extracts the visual bag-of-words features of the trademark image through the trademark database 1 and transmits them to the feature retrieval module 2; and the graphic element module 33 extracts the graphic element features of the trademark image through the trademark database 1 and transmits them to the feature retrieval module 2;
S4, based on the multi-scale convolutional neural network features, the visual bag-of-words features and the graphic element features, similarity matching is performed between the trademark to be registered and the trademarks in the registered trademark library to obtain the fusion retrieval result of the three dimensions of features.
The specific formula for the fusion retrieval result of the three dimensions of features, obtained by similarity matching between the trademark to be registered and the trademarks in the registered trademark library, is:

Score(a, d_i) = α*Score_CNN(a, d_i) + β*Score_BoVW(a, d_i) + Score_element(a, d_i)

where Score(a, d_i) is the total similarity between the trademark a to be registered and the i-th trademark d_i in the registered trademark library, Score_CNN is the similarity of the multi-scale convolutional neural network features, Score_BoVW is the similarity of the visual bag-of-words features, Score_element is the similarity of the graphic element features, and α and β are weight parameters; in this implementation α is 0.6 and β is 0.4. The trademarks in the registered trademark library are sorted by Score(a, d_i) from largest to smallest, and the sorted result is returned as the retrieval result for the trademark to be registered.
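As an illustration of this fusion and sorting step, the following NumPy sketch (an assumption about data layout, not part of the patent) takes three aligned similarity vectors, one entry per registered trademark, applies the weights α = 0.6 and β = 0.4, and returns the descending ranking.

```python
import numpy as np

def fuse_and_rank(score_cnn, score_bovw, score_element, alpha=0.6, beta=0.4):
    """Score(a, d_i) = alpha*Score_CNN + beta*Score_BoVW + Score_element,
    then sort the registered trademarks from most to least similar."""
    total = (alpha * np.asarray(score_cnn)
             + beta * np.asarray(score_bovw)
             + np.asarray(score_element))
    ranking = np.argsort(-total)          # indices, descending total similarity
    return ranking, total[ranking]
```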
In the specific implementation, the accuracy of trademark retrieval is evaluated with the MAP (mean average precision) metric. MAP is a common statistical metric for retrieval results and is defined as follows:
AP = (1/R) * Σ_k (R_k / k) * rel_k
where R is the total number of positive samples in the trademark library, R_k is the number of positive samples among the first k returned results, and rel_k indicates whether the k-th returned result is a positive sample (1 if it is, 0 if it is not). The equation above defines the AP value for a single query; the MAP metric is the average of the AP values over all queries.
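A short sketch of this evaluation metric, assuming each query's returned list is represented by a 0/1 relevance vector:

```python
import numpy as np

def average_precision(rel, R):
    """AP = (1/R) * sum over k of (R_k / k) * rel_k, where rel is the 0/1
    relevance vector of the ranked results and R is the number of positive
    samples in the trademark library for this query."""
    rel = np.asarray(rel, dtype=float)
    R_k = np.cumsum(rel)                  # positives among the first k results
    k = np.arange(1, len(rel) + 1)
    return float(np.sum((R_k / k) * rel) / R)

def mean_average_precision(rel_lists, R_list):
    """MAP is the average of the per-query AP values."""
    return float(np.mean([average_precision(r, R)
                          for r, R in zip(rel_lists, R_list)]))
```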
In this embodiment, images are first resized uniformly to 224 × 224 pixels. The first path adopts a structure in which ResNet feeds the first fully-connected layer 312; in this implementation a ResNet101 network initialized with parameters pre-trained on the ImageNet dataset (www.image-net.org) is used, and the 2048-dimensional features of its Average Pooling layer are extracted and input into the first fully-connected layer 312. The first fully-connected layer 312 uses a linear mapping with 4096 hidden neurons. The other two paths adopt a convolutional-layer-plus-pooling-layer structure; their convolutional layers use Padding values of 1 and 4 and Stride values of 16 and 32 respectively, and each yields a 1536-dimensional feature through its pooling layer. The features from the three paths are each L2-regularized, then concatenated and input into the second fully-connected layer 313, which also uses a linear mapping with 4096 hidden neurons, finally producing the 4096-dimensional multi-scale convolutional neural network features.
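The following PyTorch sketch assembles the three paths with the stated dimensions (2048-d ResNet101 Average Pooling output, 4096-neuron fully-connected layers, two 1536-channel shallow paths with Padding 1/4 and Stride 16/32). The convolution kernel size and the use of global average pooling in the shallow paths are assumptions, since the patent does not fix them; this is a sketch, not the patent's exact network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

class MultiScaleCNN(nn.Module):
    """Three-path multi-scale feature network (sketch)."""
    def __init__(self, out_dim=4096):
        super().__init__()
        backbone = models.resnet101(weights=models.ResNet101_Weights.IMAGENET1K_V1)
        # everything up to and including the Average Pooling layer (2048-d output)
        self.resnet = nn.Sequential(*list(backbone.children())[:-1])
        self.fc1 = nn.Linear(2048, 4096)                   # first fully-connected layer
        self.conv1 = nn.Conv2d(3, 1536, kernel_size=8, stride=16, padding=1)
        self.conv2 = nn.Conv2d(3, 1536, kernel_size=8, stride=32, padding=4)
        self.pool = nn.AdaptiveAvgPool2d(1)                # first/second pooling layers
        self.fc2 = nn.Linear(4096 + 1536 + 1536, out_dim)  # second fully-connected layer

    def forward(self, x):                                  # x: (B, 3, 224, 224)
        f1 = self.fc1(torch.flatten(self.resnet(x), 1))    # deep semantic path
        f2 = torch.flatten(self.pool(self.conv1(x)), 1)    # shallow content path 1
        f3 = torch.flatten(self.pool(self.conv2(x)), 1)    # shallow content path 2
        f1, f2, f3 = (F.normalize(f, p=2, dim=1) for f in (f1, f2, f3))  # L2 regularization
        return self.fc2(torch.cat([f1, f2, f3], dim=1))    # 4096-d multi-scale feature
```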
The specific way in which the neural network feature module 31 extracts the multi-scale convolutional neural network features of the trademark image in step S3 and transmits them to the feature retrieval module 2 is as follows:
S311, the trademark image is input into the residual network 311, the first convolutional layer 314 and the second convolutional layer 315 respectively; the residual network 311 is initialized with ImageNet pre-training parameters, and the output of the Average Pooling layer in the residual network 311 is passed through the first fully-connected layer 312 to obtain the first feature;
S312, the first convolutional layer 314 and the second convolutional layer 315 use different Padding and Stride values, and yield the second feature and the third feature through the first pooling layer 316 and the second pooling layer 317 respectively;
S313, the first feature, the second feature and the third feature are each L2-regularized;
S314, the regularized first, second and third features are concatenated and input into the second fully-connected layer 313;
S315, the second fully-connected layer 313 produces the multi-scale convolutional neural network features through a linear mapping.
S316, optimizing the multi-scale convolutional neural network parameters through a triple measurement loss function, and mining and learning semantic information of the registered trademark image;
the loss function is expressed as:
Loss(p,q,r)
=max(0,margin+D(f(q),f(p))-D(f(q),D(f(n))))
wherein Loss (p, q, r) is a metric Loss function of the triple (p, q, r), p, q, r are respectively a current trademark, an approximate trademark of the current trademark and an approximate trademark of the current trademark, f (·) is an output feature of the multi-scale convolutional neural network, D (·) is a cosine distance between two input features, margin is a boundary parameter, and the value of margin in the implementation process is 0.1.
In the implementation, triplets are constructed from the existing annotation information, and data samples are selected at random during training. Training is optimized with the stochastic gradient descent algorithm, using an initial learning rate of 0.001, a momentum parameter of 0.9, and a batch size of 64.
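Reusing the sketches above, a minimal training loop under the stated hyperparameters might look as follows; the triplet_loader yielding batches of (p, q, r) images is assumed, not defined by the patent.

```python
import torch

model = MultiScaleCNN()                                # sketch defined earlier
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

for p_imgs, q_imgs, r_imgs in triplet_loader:          # batches of 64 random triplets
    optimizer.zero_grad()
    loss = triplet_cosine_loss(model(p_imgs), model(q_imgs),
                               model(r_imgs), margin=0.1)
    loss.backward()
    optimizer.step()
```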
The specific way in which the visual bag-of-words feature module 32 extracts the visual bag-of-words features of the trademark image in step S3 and transmits them to the feature retrieval module 2 is as follows:
S321, keypoints of the image are extracted with three local detectors: Harris, Hessian and KAZE;
S322, keypoint features are extracted at the image keypoints with the SIFT descriptor;
S323, a visual dictionary is built by k-means clustering of the keypoint features;
S324, visual bag-of-words features are extracted from the image with the visual dictionary.
In this embodiment, keypoints of the image are first extracted with the three local detectors Harris, Hessian and KAZE.
Based on the extracted image keypoints, keypoint features are extracted with the SIFT descriptor; each keypoint feature is 128-dimensional.
A visual dictionary is built from the SIFT keypoint features with the k-means clustering algorithm; the dictionary contains 2000 words.
Finally, for each image, the term frequency and inverse document frequency are computed against the visual dictionary, and a 2000-dimensional visual bag-of-words feature is output.
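A sketch of this bag-of-visual-words pipeline with OpenCV and scikit-learn; KAZE and SIFT detectors stand in for the full Harris/Hessian/KAZE detector set (plain Harris and Hessian detectors are less directly exposed in OpenCV), and the IDF weights are assumed to be precomputed over the registered-trademark library.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def sift_descriptors(gray):
    """Detect keypoints and describe them with SIFT (128-d per keypoint)."""
    sift = cv2.SIFT_create()
    kps = list(cv2.KAZE_create().detect(gray, None)) + list(sift.detect(gray, None))
    _, desc = sift.compute(gray, kps)
    return desc                                   # shape (n_keypoints, 128)

def build_dictionary(desc_per_image, n_words=2000):
    """k-means visual dictionary of 2000 words, per the embodiment."""
    return KMeans(n_clusters=n_words, n_init=4).fit(np.vstack(desc_per_image))

def bovw_feature(desc, kmeans, idf):
    """2000-d TF-IDF bag-of-visual-words vector for one image."""
    words = kmeans.predict(desc)                  # nearest visual word per keypoint
    tf = np.bincount(words, minlength=kmeans.n_clusters).astype(float)
    tf /= max(tf.sum(), 1.0)                      # term frequency
    return tf * idf                               # idf precomputed over the library
```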
In this embodiment, an index library of registered-trademark graphic elements is established, and the graphic element features of the trademark to be registered are entered manually by the querying staff.
In this step, the index library for the graphic elements of registered trademarks is built with a full-text search engine library such as Lucene or Whoosh, to support fast retrieval of graphic elements in the next step; the index fields comprise the trademark's registration number and the set of graphic codes contained in the trademark image.
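For illustration, a Whoosh-based sketch of such an index; the field names and the example graphic codes are hypothetical, not taken from the patent.

```python
import os
from whoosh import index
from whoosh.fields import Schema, ID, KEYWORD
from whoosh.qparser import QueryParser

# Index schema: trademark registration number plus its set of graphic codes.
schema = Schema(reg_no=ID(stored=True, unique=True),
                elements=KEYWORD(stored=True, commas=True, scorable=True))
os.makedirs("element_index", exist_ok=True)
ix = index.create_in("element_index", schema)

writer = ix.writer()
writer.add_document(reg_no="12345678", elements="26.1.1,26.4.2")  # hypothetical codes
writer.commit()

# Retrieval stage: find registered trademarks sharing a manually entered
# graphic-element code with the trademark to be registered.
with ix.searcher() as searcher:
    query = QueryParser("elements", ix.schema).parse("26.1.1")
    for hit in searcher.search(query, limit=10):
        print(hit["reg_no"])
```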
The above description covers only preferred embodiments of the present invention, and the scope of the present invention is not limited thereto: equivalent substitutions and modifications of the technical solutions and inventive concepts described herein, made by any person skilled in the art, fall within the scope of the present invention.

Claims (7)

1. A trademark approximate retrieval system based on multi-dimensional feature fusion, characterized in that it comprises a trademark database (1), a feature retrieval module (2) and a feature extraction module (3); the trademark database (1) is connected with the feature extraction module (3), and the feature extraction module (3) is connected with the feature retrieval module (2);
the feature extraction module (3) comprises a neural network feature module (31), a visual bag-of-words feature module (32) and a graphic element module (33); the neural network feature module (31) is connected with the trademark database (1) and the feature retrieval module (2); the visual bag-of-words feature module (32) is connected with the trademark database (1) and the feature retrieval module (2); the graphic element module (33) is connected with the trademark database (1); the neural network feature module (31) is configured to extract multi-scale convolutional neural network features of images from the trademark database (1), optimize the neural network through training with a triplet metric loss function, and output the multi-scale convolutional neural network features to the feature retrieval module (2); the visual bag-of-words feature module (32) is configured to construct a visual dictionary from the image keypoint features extracted from the trademark database (1) and output the image visual bag-of-words features extracted with the visual dictionary to the feature retrieval module (2); the graphic element module (33) is configured to establish an index library of registered-trademark graphic elements, the graphic element features being entered manually by query staff into the feature retrieval module (2).
2. The trademark approximate retrieval system based on multi-dimensional feature fusion according to claim 1, characterized in that the neural network feature module (31) comprises a residual network (311), a first fully-connected layer (312), a second fully-connected layer (313), a first convolutional layer (314), a second convolutional layer (315), a first pooling layer (316) and a second pooling layer (317); the residual network (311), the first convolutional layer (314) and the second convolutional layer (315) are all connected with the trademark database (1); the residual network (311) is data-connected with the first fully-connected layer (312); the first convolutional layer (314) is data-connected with the first pooling layer (316); the second convolutional layer (315) is data-connected with the second pooling layer (317); the first fully-connected layer (312), the first pooling layer (316) and the second pooling layer (317) are all connected with the second fully-connected layer (313); and the second fully-connected layer (313) passes the multi-scale convolutional neural network features to the feature retrieval module (2).
3. A trademark approximate retrieval method based on multi-dimensional feature fusion, characterized in that it comprises the following steps:
S1, establishing the trademark database (1);
S2, inputting a trademark to be registered through the feature retrieval module (2);
S3, according to the trademark to be registered, the neural network feature module (31) extracts the multi-scale convolutional neural network features of the trademark image through the trademark database (1) and transmits them to the feature retrieval module (2); the visual bag-of-words feature module (32) extracts the visual bag-of-words features of the trademark image through the trademark database (1) and transmits them to the feature retrieval module (2); the graphic element module (33) extracts the graphic element features of the trademark image through the trademark database (1) and transmits them to the feature retrieval module (2);
S4, based on the multi-scale convolutional neural network features, the visual bag-of-words features and the graphic element features, similarity matching is performed between the trademark to be registered and the trademarks in the registered trademark library to obtain the fusion retrieval result of the three dimensions of features.
4. The trademark approximate retrieval method based on multi-dimensional feature fusion according to claim 3, characterized in that the specific formula for the fusion retrieval result in step S4 is:
Score(a, d_i) = α*Score_CNN(a, d_i) + β*Score_BoVW(a, d_i) + Score_element(a, d_i)
where Score(a, d_i) is the total similarity between the trademark a to be registered and the i-th trademark d_i in the registered trademark library, Score_CNN is the similarity of the multi-scale convolutional neural network features, Score_BoVW is the similarity of the visual bag-of-words features, Score_element is the similarity of the graphic element features, and α and β are weight parameters.
5. The trademark approximate retrieval method based on multi-dimensional feature fusion according to claim 3, characterized in that the specific way in which the neural network feature module (31) extracts the multi-scale convolutional neural network features of the trademark image in step S3 and transmits them to the feature retrieval module (2) is:
S311, the trademark image is input into the residual network (311), the first convolutional layer (314) and the second convolutional layer (315) respectively, and the output of the Average Pooling layer in the residual network (311) is passed through the first fully-connected layer (312) to obtain the first feature;
S312, the first convolutional layer (314) and the second convolutional layer (315) use different convolution operation parameters, and yield the second feature and the third feature through the first pooling layer (316) and the second pooling layer (317) respectively;
S313, the first feature, the second feature and the third feature are each L2-regularized;
S314, the regularized first, second and third features are concatenated and input into the second fully-connected layer (313);
S315, the second fully-connected layer (313) produces the multi-scale convolutional neural network features through a linear mapping.
6. The trademark approximate retrieval method based on multi-dimensional feature fusion according to claim 5, characterized in that extracting the neural network features of the trademark image in step S3 and transmitting them to the feature retrieval module (2) further comprises:
S316, the multi-scale convolutional neural network parameters are optimized through the triplet metric loss function, mining and learning the semantic information of registered trademark images;
the loss function is expressed as:
Loss(p, q, r) = max(0, margin + D(f(q), f(p)) - D(f(q), f(r)))
where Loss(p, q, r) is the metric loss function of the triplet (p, q, r); p, q and r are the current trademark, an approximate trademark of the current trademark and a non-approximate trademark of the current trademark, respectively; f(·) is the output feature of the multi-scale convolutional neural network; D(·,·) is the cosine distance between two input features; and margin is a boundary parameter.
7. The trademark approximate retrieval method based on multi-dimensional feature fusion according to claim 3, characterized in that the specific way in which the visual bag-of-words feature module (32) extracts the visual bag-of-words features of the trademark image in step S3 and transmits them to the feature retrieval module (2) is:
S321, keypoints of the image are extracted with three local detectors: Harris, Hessian and KAZE;
S322, keypoint features are extracted at the image keypoints with the SIFT descriptor;
S323, a visual dictionary is built by k-means clustering of the keypoint features;
S324, visual bag-of-words features are extracted from the image with the visual dictionary.
CN202011046201.7A 2020-09-29 2020-09-29 A trademark similarity retrieval system and method based on multi-dimensional feature fusion Active CN112182275B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011046201.7A CN112182275B (en) 2020-09-29 2020-09-29 A trademark similarity retrieval system and method based on multi-dimensional feature fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011046201.7A CN112182275B (en) 2020-09-29 2020-09-29 A trademark similarity retrieval system and method based on multi-dimensional feature fusion

Publications (2)

Publication Number Publication Date
CN112182275A true CN112182275A (en) 2021-01-05
CN112182275B CN112182275B (en) 2024-12-27

Family

ID=73945705

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011046201.7A Active CN112182275B (en) 2020-09-29 2020-09-29 A trademark similarity retrieval system and method based on multi-dimensional feature fusion

Country Status (1)

Country Link
CN (1) CN112182275B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112884005A (en) * 2021-01-21 2021-06-01 汉唐信通(北京)科技有限公司 Image retrieval method and device based on SPTAG and convolutional neural network
CN113326926A (en) * 2021-06-30 2021-08-31 上海理工大学 Fully-connected Hash neural network for remote sensing image retrieval
CN114003790A (en) * 2021-12-30 2022-02-01 北京企名片科技有限公司 Data processing method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106126581A (en) * 2016-06-20 2016-11-16 复旦大学 Cartographical sketching image search method based on degree of depth study
CN106919920A (en) * 2017-03-06 2017-07-04 重庆邮电大学 Scene recognition method based on convolution feature and spatial vision bag of words
CN109273054A (en) * 2018-08-31 2019-01-25 南京农业大学 Protein subcellular interval prediction method based on relational map
CN110347853A (en) * 2019-07-09 2019-10-18 成都澳海川科技有限公司 A kind of image hash code generation method based on Recognition with Recurrent Neural Network
CN110852327A (en) * 2019-11-07 2020-02-28 首都师范大学 Image processing method, device, electronic device and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106126581A (en) * 2016-06-20 2016-11-16 复旦大学 Cartographical sketching image search method based on degree of depth study
CN106919920A (en) * 2017-03-06 2017-07-04 重庆邮电大学 Scene recognition method based on convolution feature and spatial vision bag of words
CN109273054A (en) * 2018-08-31 2019-01-25 南京农业大学 Protein subcellular interval prediction method based on relational map
CN110347853A (en) * 2019-07-09 2019-10-18 成都澳海川科技有限公司 A kind of image hash code generation method based on Recognition with Recurrent Neural Network
CN110852327A (en) * 2019-11-07 2020-02-28 首都师范大学 Image processing method, device, electronic device and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU Dong: "Medical image classification method based on a spatial co-occurrence bag-of-words model and a convolutional neural network", Journal of Xiangnan University, 25 April 2020 (2020-04-25), pages 1 - 3 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112884005A (en) * 2021-01-21 2021-06-01 汉唐信通(北京)科技有限公司 Image retrieval method and device based on SPTAG and convolutional neural network
CN113326926A (en) * 2021-06-30 2021-08-31 上海理工大学 Fully-connected Hash neural network for remote sensing image retrieval
CN114003790A (en) * 2021-12-30 2022-02-01 北京企名片科技有限公司 Data processing method

Also Published As

Publication number Publication date
CN112182275B (en) 2024-12-27

Similar Documents

Publication Publication Date Title
CN109522553B (en) Named entity identification method and device
CN108875074B (en) Answer selection method and device based on cross attention neural network and electronic equipment
CN106295796B (en) entity link method based on deep learning
CN112100346B (en) A visual question answering method based on the fusion of fine-grained image features and external knowledge
CN110222140A (en) A kind of cross-module state search method based on confrontation study and asymmetric Hash
CN110609891A (en) A Visual Dialogue Generation Method Based on Context-Aware Graph Neural Network
CN116402063B (en) Multimodal satire recognition method, device, equipment and storage medium
CN113535949B (en) Multi-modal combined event detection method based on pictures and sentences
CN113177141A (en) Multi-label video hash retrieval method and device based on semantic embedded soft similarity
CN112182275B (en) A trademark similarity retrieval system and method based on multi-dimensional feature fusion
CN110222560A (en) A kind of text people search's method being embedded in similitude loss function
CN114863194B (en) Scientific and technological information-oriented cross-media retrieval feature mapping network training method, retrieval method and device
CN118312600B (en) Intelligent customer service question-answering method based on knowledge graph and large language model
CN113537304A (en) Cross-modal semantic clustering method based on bidirectional CNN
CN115658934A (en) Image-text cross-modal retrieval method based on multi-class attention mechanism
CN113239159A (en) Cross-modal retrieval method of videos and texts based on relational inference network
CN109284414A (en) Method and system for cross-modal content retrieval based on semantic preservation
CN117333908A (en) Cross-modal pedestrian re-identification method based on posture feature alignment
Dong et al. Cross-media similarity evaluation for web image retrieval in the wild
CN114239612A (en) A kind of multimodal neural machine translation method, computer equipment and storage medium
CN116955707A (en) Content tag determination method, device, equipment, medium and program product
CN117787400A (en) A multi-modal knowledge graph construction method based on dual-stream coding and contrastive learning
CN114239730A (en) A Cross-modal Retrieval Method Based on Neighbor Ranking Relation
CN115292533A (en) Cross-modal pedestrian retrieval method driven by visual positioning
CN117743614B (en) Remote sensing image text retrieval method based on remote sensing multimodal basic model

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant