CN114461827A - Method and device for searching picture by picture - Google Patents

Method and device for searching picture by picture

Info

Publication number
CN114461827A
CN114461827A
Authority
CN
China
Prior art keywords
features
image
dictionary
convolution
images
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210115546.6A
Other languages
Chinese (zh)
Inventor
朱利霞
伊文超
李明明
潘心冰
何彬彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Cloud Information Technology Co Ltd
Original Assignee
Inspur Cloud Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Cloud Information Technology Co Ltd filed Critical Inspur Cloud Information Technology Co Ltd
Priority to CN202210115546.6A priority Critical patent/CN114461827A/en
Publication of CN114461827A publication Critical patent/CN114461827A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval of still image data
    • G06F16/51 Indexing; Data structures therefor; Storage structures
    • G06F16/53 Querying
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval using metadata automatically derived from the content
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques with a fixed number of clusters, e.g. K-means clustering
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks


Abstract

The invention relates to the technical field of image processing and provides a method for searching a picture by a picture, which comprises the following steps: S1, extracting image features and attention weight information; S2, fusing the obtained features of three different layers with the attention weight information; S3, clustering the features and constructing a dictionary tree; S4, performing reverse indexing according to the dictionary tree to obtain the dictionary vector of each image; and S5, calculating similarity according to the dictionary vectors, sorting, and outputting the search results. Compared with the prior art, the method compresses features on the basis of a dictionary tree: image features are converted into fixed-dimension vectors according to the constructed dictionary tree, which are then stored and used for similarity calculation. This reduces the storage space required, speeds up similarity calculation, and thus completes the image-based image search task accurately and quickly.

Description

Method and device for searching picture by picture
Technical Field
The invention relates to the technical field of image processing, and particularly provides a method and a device for searching images by using images.
Background
With the rapid development of social networks and e-commerce, image data grows at an astonishing rate every day, forming huge image databases. These databases contain rich information, and retrieving the images a user needs from a massive image database is currently a research direction of great application demand and development prospect in the field of image processing. Well-known search engines such as Google and Baidu already provide image search services, making it convenient for users to obtain more image data on demand.
The search-by-image technology is a branch of the image retrieval field. Conventional image retrieval techniques fall into two main categories: text-based image retrieval and image retrieval based on image content semantics. Text-based image retrieval performs retrieval according to textual descriptions, which requires annotating and paraphrasing the image data with text, an almost impossible task for a massive image database; image retrieval based on image content semantics mainly compares image information using cues such as the color and texture of an image and the object categories it contains.
In the actual image retrieval process, complex background noise directly affects the final search performance, so many algorithms use a region proposal network (RPN) to extract regions of interest before extracting features and comparing similarities. However, using an RPN to extract regions of interest requires considerable work to generate and configure the anchor boxes, and the boxed candidate regions are coarse, so foreground information of the image may be lost.
In deep-learning-based image retrieval, similarity is generally calculated on the features of the last layer of a deep neural network to complete the retrieval task, but high-level features lose much detail; fusing high-level and low-level features therefore uses image feature information more efficiently and reasonably. A deep-learning image retrieval pipeline generally proceeds as follows: a convolutional neural network first extracts image features to obtain feature maps, distances between the image features are calculated with a metric learning method such as the Euclidean distance, and the images are then sorted by distance to obtain similar images. However, the feature dimensionality obtained from a neural network is high, so storing the features occupies a large amount of space and calculating similarities is time-consuming.
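A minimal sketch of this conventional pipeline follows; it is an illustration rather than part of the patent, with the function name and array shapes assumed, and the features presumed already extracted by a CNN:

```python
import numpy as np

def baseline_retrieval(query_feat, db_feats, top_n=10):
    # Conventional pipeline described above: compare one CNN feature
    # vector with the database by Euclidean distance, then sort.
    # query_feat: (d,) array; db_feats: (num_images, d) array.
    dists = np.linalg.norm(db_feats - query_feat, axis=1)
    return np.argsort(dists)[:top_n]          # indices of nearest images
```

With d in the thousands, storing db_feats and scanning it for every query is exactly the storage and speed burden described above.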
Searching a picture by a picture is a technique in the image search field for finding similar images given a query image; current algorithms are derivatives and refinements of image search based on image content semantics. The most critical part of search-by-image technology is image feature extraction and expression. The problems encountered at present are that global features contain much background noise while local features lose key information, leading to low retrieval accuracy and slow retrieval speed; using both global and local features greatly increases the feature dimensionality and hence the cost of subsequent computation.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a highly practical method for searching a picture by a picture.
The invention further aims to provide a reasonably designed, safe and applicable device for searching a picture by a picture.
The technical scheme adopted by the invention to solve the technical problem is as follows:
A method for searching a picture by a picture comprises the following steps:
S1, extracting image features and attention weight information;
S2, fusing the obtained features of three different layers with the attention weight information;
S3, clustering the features and constructing a dictionary tree;
S4, performing reverse indexing according to the dictionary tree to obtain the dictionary vector of each image;
and S5, calculating similarity according to the dictionary vectors, sorting, and outputting the search results.
Further, in step S1, the image I1 is resized to a fixed 224 × 224 × 3 and input into the VGG16 convolution model, and convolution feature maps of three different levels are extracted, wherein the first-level convolution feature map is taken from layers 3 to 5 of the VGG16 convolution model, the second-level map from layers 7 to 9, and the third-level map from layers 10 to 13.
Further, in step S1, the method further includes:
S11, extracting the layer-4 convolution feature map of VGG16, of size 112 × 112 × 128, denoted F1; extracting the layer-7 convolution feature map of VGG16, of size 56 × 56 × 256, denoted F2; and extracting the layer-13 convolution feature map of VGG16, of size 14 × 14 × 512, denoted F3;
S12, extracting the attention weight information A3 of F3; the attention weight information A1 of F1 and A2 of F2 is obtained in the same way.
Further, in step S2, the method further includes:
S21, merging the attention weight information into the corresponding features to obtain attention-weighted features:
F1' = A1 ⊙ F1, F2' = A2 ⊙ F2, F3' = A3 ⊙ F3
where ⊙ denotes the element-wise weighting of a feature map by its attention weights;
S22, fusing the three attention-weighted features of different scales: the feature F3' is upsampled to a feature of size 56 × 56 × 512 and spliced with the feature of size 56 × 56 × 512 obtained by passing F2' through a 1 × 1 convolution block:
F23 = Concat(Up(F3'), Conv1×1(F2'))
where Up(F3') denotes upsampling of the feature F3', Conv1×1(F2') denotes a convolution operation on F2' with a convolution kernel size of 1 × 1, and Concat(·, ·) denotes the splicing of the two features;
S23, the spliced feature is then upsampled to a feature of size 112 × 112 × 512 and spliced with the feature of size 112 × 112 × 512 obtained by passing F1' through a 1 × 1 convolution block, finally yielding the fused feature:
Ffuse = Concat(Up(F23), Conv1×1(F1'))
The feature information of all images in the image database can be extracted through step S2.
Further, in step S3, the method further includes:
S31, performing adaptive hierarchical K-Means clustering on the obtained features to construct a dictionary tree, where each layer of the dictionary tree has at most n cluster nodes and the tree has at most m layers: the features are first clustered to obtain p cluster centers {μ1, …, μp}; clustering then continues under each cluster center until the preset number of layers is reached, at which point clustering ends and the construction of the dictionary tree, whose nodes are the cluster centers, is complete;
S32, calculating the node weight according to the number of features covered by each node in the dictionary tree, using the formula:
W_T = log(N / N_T)
where N represents the total number of images in the image library and N_T represents the number of images whose features are covered by dictionary-tree node T; the logarithm of their ratio gives the weight of node T.
Further, in step S4, dictionary vectors of the images in the image library are calculated from the dictionary-tree information and stored: each image is processed through steps S1 and S2 to obtain its original features, the dictionary tree obtained in step S3 is applied, and the dictionary vector of each database image is calculated, completing the compressed expression of the features.
Further, for image I1, the frequency m_T with which its features appear in each dictionary-tree node T is counted in step S3, and the dictionary vector is calculated as:
d_I1 = (W_T1 · m_T1, …, W_Tp · m_Tp)
where W_T represents the weight of node T in the dictionary tree and d_I1 represents the dictionary vector of image I1 obtained from the dictionary tree; the dictionary vectors of all images in the database can be calculated in this way, the image features are characterized by these vectors, and the vectors are stored in the image feature library.
Further, in step S5, the dictionary vector of the image to be retrieved is calculated and denoted d_query; the similarity between d_query and each image dictionary vector d_j stored in the database is then calculated over their p dimensions, where p represents the dimension of the dictionary vectors d_j and d_query, and d_j represents the dictionary vector of the j-th image in the image library;
then the images are sorted according to similarity, and the top N images with the highest similarity are returned as results.
An apparatus for searching a picture by a picture, comprising: at least one memory and at least one processor;
the at least one memory to store a machine readable program;
the at least one processor is configured to invoke the machine readable program to perform the above method for searching a picture by a picture.
Compared with the prior art, the method and the device for searching a picture by a picture have the following outstanding advantages:
the invention uses the convolution network to extract the features of different layers and different scales, gradually fuses the features and fuses the attention weight information, so that the network not only focuses on the global feature information but also can increase the expression capability of the significant features in the training process, and the subsequent calculation cost is reduced.
The features are compressed on the basis of a dictionary tree: the image features are converted into fixed-dimension vectors according to the constructed dictionary tree, which are then stored and used for similarity calculation. This reduces the storage space required, speeds up similarity calculation, and thus completes the image-based image search task accurately and quickly.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flow chart of the method for searching a picture by a picture.
Detailed Description
The present invention will be described in further detail with reference to specific embodiments in order to better understand the technical solutions of the present invention. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
A preferred embodiment is given below:
As shown in FIG. 1, the method for searching a picture by a picture in this embodiment first applies the convolution operations of a VGG16 network to the image and extracts image features of different layers together with their self-attention weight information. Next, the obtained features of the three different layers are fused with the attention weight information: lower-layer features have higher resolution and contain more position and detail information, but are less semantic and noisier, while higher-layer features carry stronger semantic information but have lower resolution and poorer perception of detail. Fusing the features of the three different levels enriches and strengthens the expressive power of the features.
The features are clustered to construct a dictionary tree, and node weights are calculated from the dictionary tree; reverse indexing is performed to calculate the dictionary vectors of the images in the database, which are then stored.
Finally, the similarities are calculated and sorted, and the image retrieval result is output: features of the image to be queried are extracted, reverse indexing is performed according to the constructed dictionary tree to obtain its image vector, the similarity between this vector and the image vectors in the image database is calculated, and the results are sorted to obtain the search result.
The specific operation steps are as follows:
S1, extracting image features and attention weight information:
The image I1 is resized to a fixed 224 × 224 × 3 and input into the VGG16 convolution model, and convolution feature maps of three different levels are extracted, wherein the first-level convolution feature map is taken from layers 3 to 5 of the VGG16 convolution model, the second-level map from layers 7 to 9, and the third-level map from layers 10 to 13.
This step further comprises:
S11, resizing the image I1 to a fixed 224 × 224 × 3 and inputting it into the VGG16 convolution model; the layer-4 convolution feature map of VGG16, of size 112 × 112 × 128, is extracted and denoted F1; the layer-7 convolution feature map, of size 56 × 56 × 256, is extracted and denoted F2; and the layer-13 convolution feature map, of size 14 × 14 × 512, is extracted and denoted F3.
S12, extracting the attention weight information of the features. In a convolutional layer, the output features are obtained as linear combinations of convolution kernels and the original features; convolution kernels are local and their receptive field is limited, whereas the attention mechanism captures global information to obtain a larger receptive field and context. The attention weight information A3 of F3 is extracted, and the attention weight information A1 of F1 and A2 of F2 is obtained in the same way.
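A minimal PyTorch sketch of step S1 follows. The tap indices assume the torchvision layout of VGG16, where the 4th, 7th and 13th convolutional layers sit at positions 7, 14 and 28 of model.features; the attention function is only a placeholder assumption (a softmax over the spatial positions of the channel-summed map), since the patent gives its attention formula only in a drawing:

```python
import torch
from torchvision.models import vgg16

# Tap the 4th, 7th and 13th conv layers of VGG16 to obtain the
# 112x112x128, 56x56x256 and 14x14x512 maps F1, F2, F3 described above.
# Load pretrained weights in practice; weights=None keeps the sketch light.
backbone = vgg16(weights=None).features.eval()
TAPS = {7: "F1", 14: "F2", 28: "F3"}

def extract_features(image):                    # image: (1, 3, 224, 224)
    feats, x = {}, image
    with torch.no_grad():
        for idx, layer in enumerate(backbone):
            x = layer(x)
            if idx in TAPS:
                feats[TAPS[idx]] = x
    return feats

def spatial_attention(f):
    # Placeholder attention (assumed form): softmax over the spatial
    # positions of the channel-summed activation map, one weight per pixel.
    a = f.sum(dim=1, keepdim=True)              # (1, 1, H, W)
    return torch.softmax(a.flatten(2), dim=-1).view_as(a)
```

Applied to a 224 × 224 × 3 input, extract_features returns maps of exactly the three sizes listed in S11.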
S2, fusing the obtained features of the three different layers with the attention weight information;
further comprising the following steps:
and S21, integrating the attention weight information into the corresponding features to obtain the features containing the attention weight:
Figure BDA0003496191570000085
Figure BDA0003496191570000086
Figure BDA0003496191570000087
and S22, fusing the features of the three different scales. Will be characterized by
Figure BDA0003496191570000088
Features of size 56 x 512 are obtained by upsampling and are compared with
Figure BDA0003496191570000089
Features of size 56 × 512 obtained by convolution block of 1 × 1 are spliced:
Figure BDA00034961915700000810
wherein
Figure BDA00034961915700000811
Characteristic of expression pair
Figure BDA00034961915700000812
The up-sampling is carried out and,
Figure BDA00034961915700000813
characteristic of expression pair
Figure BDA00034961915700000814
A convolution operation with a convolution kernel size of 1 x 1 is performed,
Figure BDA00034961915700000815
representation feature
Figure BDA00034961915700000816
And features of
Figure BDA00034961915700000817
And (5) splicing results.
S23, the spliced features are up-sampled to obtain features 112 x 512, and the features are combined with the features
Figure BDA0003496191570000091
Splicing features with the size of 112 × 512 obtained by the features through a convolution block of 1 × 1 to finally obtain fused features
Figure BDA0003496191570000092
Figure BDA0003496191570000093
Wherein the content of the first and second substances,
Figure BDA0003496191570000094
representation feature
Figure BDA0003496191570000095
And characteristics of
Figure BDA0003496191570000096
As a result of the splicing, the result,
Figure BDA0003496191570000097
characteristic of expression pair
Figure BDA0003496191570000098
The up-sampling is carried out and,
Figure BDA0003496191570000099
characteristic of expression pair
Figure BDA00034961915700000910
A convolution operation with a convolution kernel size of 1 x 1 is performed. Through the two steps, the characteristic information of all images in the image database can be extracted, and the next step is carried out.
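A sketch of this fusion under the stated sizes follows. The text leaves "splicing" open; channel-wise concatenation is assumed here, and bilinear interpolation stands in for the unspecified upsampling operator:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# 1x1 convolution blocks that lift F2' (256 ch) and F1' (128 ch) to the
# 512 channels named in S22/S23.
conv1x1_f2 = nn.Conv2d(256, 512, kernel_size=1)
conv1x1_f1 = nn.Conv2d(128, 512, kernel_size=1)

def fuse(F1, A1, F2, A2, F3, A3):
    # S21: element-wise attention weighting (assumed reading).
    F1w, F2w, F3w = F1 * A1, F2 * A2, F3 * A3
    # S22: upsample F3' to 56x56 and splice with Conv1x1(F2').
    up3 = F.interpolate(F3w, size=(56, 56), mode="bilinear", align_corners=False)
    f23 = torch.cat([up3, conv1x1_f2(F2w)], dim=1)
    # S23: upsample the spliced map to 112x112 and splice with Conv1x1(F1').
    up23 = F.interpolate(f23, size=(112, 112), mode="bilinear", align_corners=False)
    return torch.cat([up23, conv1x1_f1(F1w)], dim=1)
```

Note that with concatenation the fused map carries more than 512 channels; if the 112 × 112 × 512 size in S23 is to hold exactly, the splice would instead be an element-wise addition.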
S3, carrying out feature clustering and constructing a dictionary tree;
the constructed dictionary tree is adaptively constructed according to a specific image database,
further comprising:
and S31, carrying out self-adaptive K-Means hierarchical clustering on the features acquired in the steps and constructing a dictionary tree. Here, the clustering node of each layer in the dictionary tree is set to be at most n, and the dictionary tree is set to be at most m layers. Firstly, clustering the features to obtain p clustering centers, wherein the p clustering centers are respectively { mu1,...,μp}. And then clustering is carried out under each clustering center until the whole clustering process reaches the preset number of layers, clustering is finished, and the construction of the dictionary tree is completed until the clustering centers are the nodes of the dictionary tree.
And S32, calculating the node weight according to the number of the covering features of each node in the dictionary number. The node weight calculation formula is as follows:
WT=log(N/NT) (7)
where N represents the total number of images in the image library, NTRepresenting the number of images for the features covered in the dictionary tree node T, and then taking the logarithm of the ratio of the two to obtain the weight of the node T.
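A compact sketch of the tree construction follows, using scikit-learn's KMeans; the Node layout and the default branching factor n and depth m are illustrative assumptions:

```python
import numpy as np
from sklearn.cluster import KMeans

class Node:
    # One dictionary-tree node: a cluster center plus its children.
    def __init__(self, center=None):
        self.center = center
        self.children = []

def build_tree(features, n=8, m=3, depth=0):
    # S31: recursive hierarchical K-Means, at most n children per node
    # and at most m layers; cluster centers become the tree nodes.
    node = Node()
    if depth == m or len(features) <= n:
        return node
    km = KMeans(n_clusters=n, n_init=10).fit(features)
    for i, center in enumerate(km.cluster_centers_):
        child = build_tree(features[km.labels_ == i], n, m, depth + 1)
        child.center = center
        node.children.append(child)
    return node

def node_weight(N, N_T):
    # S32: W_T = log(N / N_T), with N the images in the library and
    # N_T the images whose features are covered by node T.
    return np.log(N / N_T)
```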
S4, performing reverse indexing according to the dictionary tree to obtain a dictionary vector of the image;
and calculating and storing dictionary vectors of the images in the image library according to the dictionary tree information. The image is processed by the steps S1 and S2 to obtain original features, a dictionary tree is obtained according to the step S3, a dictionary vector of the database image is calculated, and feature compression expression is completed. For image I1Counting the frequency of the appearance of the characteristics in the nodes of the dictionary tree
Figure BDA0003496191570000101
The dictionary vector is calculated according to:
Figure BDA0003496191570000102
wherein WTThe weights of the nodes in the dictionary tree are represented,
Figure BDA0003496191570000103
representing an image I1And obtaining a dictionary vector according to the dictionary tree. According to the method, dictionary vectors of all images in the database can be calculated, the features are characterized and stored in the image feature library.
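A sketch of this step follows, under the TF-IDF-style reading of the formula above (one vector component per node, equal to W_T · m_T); the node_index and weights mappings are illustrative assumptions:

```python
import numpy as np

def descend(tree, feat, counts):
    # Route one local feature down the dictionary tree, always to the
    # nearest child center, counting every node it visits (the m_T values).
    node = tree
    while node.children:
        node = min(node.children,
                   key=lambda c: np.linalg.norm(feat - c.center))
        counts[id(node)] = counts.get(id(node), 0) + 1

def dictionary_vector(features, tree, node_index, weights):
    # node_index: {node id -> vector dimension}; weights: {node id -> W_T}.
    counts = {}
    for f in features:
        descend(tree, f, counts)
    vec = np.zeros(len(node_index))
    for nid, m_T in counts.items():
        vec[node_index[nid]] = weights[nid] * m_T   # W_T * m_T
    return vec
```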
S5, calculating similarity according to the dictionary vectors, sorting, and outputting the search-by-image result:
calculating dictionary vector of image to be retrieved, and recording as dqueryCalculating a dictionary vector d of the band search image according to the following formulaquerySimilarity to image dictionary vectors stored in the database:
Figure BDA0003496191570000104
wherein p represents a dictionary vector
Figure BDA0003496191570000105
Sum vector dqueryThe dimension (c) of (a) is,
Figure BDA0003496191570000106
a dictionary vector representing the jth image in the image library.
Then, the images are sorted according to the similarity, and the top N images with high similarity are returned as results.
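A sketch of the retrieval step follows, assuming the similarity is a normalized inner product (cosine) over the p-dimensional dictionary vectors; the patent gives its exact similarity formula only in a drawing, so this choice is an assumption:

```python
import numpy as np

def search(d_query, db_vectors, top_n=10):
    # db_vectors: (num_images, p) matrix of stored dictionary vectors.
    db = np.asarray(db_vectors)
    sims = db @ d_query / (
        np.linalg.norm(db, axis=1) * np.linalg.norm(d_query) + 1e-12)
    order = np.argsort(-sims)                   # descending similarity
    return order[:top_n], sims[order[:top_n]]
```

Because every dictionary vector has the same fixed dimension p, the comparison reduces to a single matrix-vector product, which is the storage and speed gain claimed for the dictionary-tree compression.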
In this embodiment, an apparatus for searching a picture by a picture includes: at least one memory and at least one processor;
the at least one memory to store a machine readable program;
the at least one processor is configured to invoke the machine readable program to perform the above method for searching a picture by a picture.
The above embodiments are only specific examples; the scope of the present invention includes but is not limited to them, and any suitable change or substitution made by one of ordinary skill in the art that is consistent with the claims of this method and apparatus for searching a picture by a picture shall fall within the protection scope of the present invention.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (9)

1. A method for searching a picture by a picture, characterized by comprising the following steps:
S1, extracting image features and attention weight information;
S2, fusing the obtained features of three different layers with the attention weight information;
S3, clustering the features and constructing a dictionary tree;
S4, performing reverse indexing according to the dictionary tree to obtain the dictionary vector of each image;
and S5, calculating similarity according to the dictionary vectors, sorting, and outputting the search results.
2. The method for searching a picture by a picture according to claim 1, characterized in that in step S1, the image I1 is resized to a fixed 224 × 224 × 3 and input into the VGG16 convolution model, and convolution feature maps of three different levels are extracted, wherein the first-level convolution feature map is taken from layers 3 to 5 of the VGG16 convolution model, the second-level map from layers 7 to 9, and the third-level map from layers 10 to 13.
3. The method for searching a picture by a picture according to claim 2, characterized in that step S1 further comprises:
S11, extracting the layer-4 convolution feature map of VGG16, of size 112 × 112 × 128, denoted F1; extracting the layer-7 convolution feature map of VGG16, of size 56 × 56 × 256, denoted F2; and extracting the layer-13 convolution feature map of VGG16, of size 14 × 14 × 512, denoted F3;
S12, extracting the attention weight information A3 of F3; the attention weight information A1 of F1 and A2 of F2 is obtained in the same way.
4. The method for searching a picture by a picture according to claim 3, characterized in that step S2 further comprises:
S21, merging the attention weight information into the corresponding features to obtain attention-weighted features:
F1' = A1 ⊙ F1, F2' = A2 ⊙ F2, F3' = A3 ⊙ F3
where ⊙ denotes the element-wise weighting of a feature map by its attention weights;
S22, fusing the three attention-weighted features of different scales: the feature F3' is upsampled to a feature of size 56 × 56 × 512 and spliced with the feature of size 56 × 56 × 512 obtained by passing F2' through a 1 × 1 convolution block:
F23 = Concat(Up(F3'), Conv1×1(F2'))
where Up(F3') denotes upsampling of the feature F3', Conv1×1(F2') denotes a convolution operation on F2' with a convolution kernel size of 1 × 1, and Concat(·, ·) denotes the splicing of the two features;
S23, upsampling the spliced feature to a feature of size 112 × 112 × 512 and splicing it with the feature of size 112 × 112 × 512 obtained by passing F1' through a 1 × 1 convolution block, finally yielding the fused feature:
Ffuse = Concat(Up(F23), Conv1×1(F1'))
the feature information of all images in the image database being extracted through step S2.
5. The method for searching a picture by a picture according to claim 4, characterized in that step S3 further comprises:
S31, performing adaptive hierarchical K-Means clustering on the obtained features to construct a dictionary tree, wherein each layer of the dictionary tree has at most n cluster nodes and the tree has at most m layers: the features are first clustered to obtain p cluster centers {μ1, …, μp}; clustering then continues under each cluster center until the preset number of layers is reached, at which point clustering ends and the construction of the dictionary tree, whose nodes are the cluster centers, is complete;
S32, calculating the node weight according to the number of features covered by each node in the dictionary tree, using the formula:
W_T = log(N / N_T)
where N represents the total number of images in the image library and N_T represents the number of images whose features are covered by dictionary-tree node T; the logarithm of their ratio gives the weight of node T.
6. The method for searching a picture by a picture according to claim 5, characterized in that in step S4, dictionary vectors of the images in the image library are calculated from the dictionary-tree information and stored; the images are processed through steps S1 and S2 to obtain their original features, the dictionary tree obtained in step S3 is applied, and the dictionary vectors of the database images are calculated, completing the compressed expression of the features.
7. The method for searching a picture by a picture according to claim 6, characterized in that for image I1, the frequency m_T with which its features appear in each dictionary-tree node T is counted in step S3, and the dictionary vector is calculated as:
d_I1 = (W_T1 · m_T1, …, W_Tp · m_Tp)
where W_T represents the weight of node T in the dictionary tree and d_I1 represents the dictionary vector of image I1 obtained from the dictionary tree; the dictionary vectors of all images in the database are calculated in this way, the image features are characterized by these vectors, and the vectors are stored in the image feature library.
8. The method for searching a picture by a picture according to claim 6, characterized in that in step S5, the dictionary vector of the image to be retrieved is calculated and denoted d_query, and the similarity between d_query and each image dictionary vector d_j stored in the database is calculated over their p dimensions, where p represents the dimension of the dictionary vectors d_j and d_query, and d_j represents the dictionary vector of the j-th image in the image library;
then, the images are sorted according to similarity, and the top N images with the highest similarity are returned as results.
9. An apparatus for searching a picture by a picture, characterized by comprising: at least one memory and at least one processor;
the at least one memory to store a machine readable program;
the at least one processor, configured to invoke the machine readable program, to perform the method of any of claims 1 to 8.
CN202210115546.6A 2022-02-07 2022-02-07 Method and device for searching picture by picture Pending CN114461827A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210115546.6A CN114461827A (en) 2022-02-07 2022-02-07 Method and device for searching picture by picture

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210115546.6A CN114461827A (en) 2022-02-07 2022-02-07 Method and device for searching picture by picture

Publications (1)

Publication Number Publication Date
CN114461827A (en) 2022-05-10

Family

ID=81411487

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210115546.6A Pending CN114461827A (en) 2022-02-07 2022-02-07 Method and device for searching picture by picture

Country Status (1)

Country Link
CN (1) CN114461827A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116662588A (en) * 2023-08-01 2023-08-29 山东省大数据中心 Intelligent searching method and system for mass data
CN116662588B (en) * 2023-08-01 2023-10-10 山东省大数据中心 Intelligent searching method and system for mass data


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination