CN107133348B - Approximate searching method based on semantic consistency in large-scale picture set - Google Patents


Info

Publication number
CN107133348B
Authority
CN
China
Prior art keywords
picture
matrix
pictures
picture set
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201710368677.4A
Other languages
Chinese (zh)
Other versions
CN107133348A (en
Inventor
胡鸣珂
胡海峰
吕成钢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN201710368677.4A
Publication of CN107133348A
Application granted
Publication of CN107133348B
Legal status: Expired - Fee Related

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51Indexing; Data structures therefor; Storage structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/5866Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, manually generated location and time information

Abstract

The invention discloses an approximate search method based on semantic consistency in a large-scale picture set, comprising two processes. Training process: semantic consistency is introduced when calculating the similarity between the pictures in the picture set and the sampled pictures, and the transformation matrix required by the next stage is obtained. Hash coding process: the optimized similarity between the pictures and the sampled pictures is calculated from the transformation matrix obtained in training, and a similarity matrix is constructed from the optimized similarity so that each picture in the picture set can be binary-coded with a hash coding technique; the binary code of a new query picture is then compared by Hamming distance with the binary code of each picture to find the neighbors of the query picture. The invention introduces semantic consistency when measuring picture similarity, so the similarity between pictures can be measured more accurately; it reduces training time by using stochastic gradient descent, and it can be applied effectively to large-scale picture data sets.

Description

Approximate searching method based on semantic consistency in large-scale picture set
Technical Field
The invention relates to a method for approximately searching pictures in a large-scale picture data set, and belongs to the technical field of machine learning.
Background
One important application of nearest-neighbor queries is approximate picture search. In the big-data era, picture data is characterized by an extremely large data scale and very high feature dimensionality. Efficient and accurate nearest-neighbor queries over massive collections of high-dimensional pictures therefore have great practical value for research in frontier fields such as computer vision and machine learning.
Conventional nearest-neighbor query algorithms, such as search algorithms based on tree index structures, suffer from the curse of dimensionality. Their performance degrades rapidly when approximate nearest-neighbor search is performed on high-dimensional picture data, so they are not suited to the current big-data era. The most popular approach today is approximate nearest-neighbor search based on hashing. Classical hashing algorithms such as Locality-Sensitive Hashing (LSH) convert the neighbor search problem into one of finding similar binary codes. Hash-based approximate search has a simpler index structure and needs less storage space. However, to ensure both precision and recall, LSH must construct multiple hash tables, which greatly increases query time and storage overhead.
Graph-based hashing algorithms that produce more efficient codes have also emerged; they perform better because they measure the similarity between picture samples more accurately. Examples are the Spectral Hashing (SH) algorithm and the Anchor Graph Hashing (AGH) algorithm. However, these algorithms are too one-sided when looking for neighboring pictures: they consider only the actual storage location of a picture in the data set and ignore the semantic tag information the picture may carry, which makes them less effective in approximate picture search. In a real large-scale picture data set, many pictures have semantic tag information, and different class labels indicate that pictures belong to different classes. For example, two pictures may be stored far apart in the data set, but if both carry the class label "sky", they are also approximate pictures. As a result, currently popular approximate picture search algorithms often perform poorly on large-scale picture data sets and do not solve the practical problem well.
Disclosure of Invention
The technical problem to be solved by the invention is to provide an approximate picture search method, based on semantic consistency, that can be applied to large-scale picture data sets. The method addresses approximate picture search by using hashing to map similar pictures to identical or similar binary codes.
The invention adopts the following technical scheme for solving the technical problems:
The invention provides an approximate search method based on semantic consistency in a large-scale picture set, comprising the following steps:
step 1: inputting a picture set sample matrix X and a semantic class label matrix Y corresponding to the picture set, where X is an n × d matrix, Y is an n × c matrix, n is the number of picture samples, d is the dimension of the picture features, and c is the number of class labels;
step 2: randomly extracting a subset of pictures from the picture set as the sampled picture set;
step 3: defining a relation matrix W between the pictures in the picture set and the pictures in the sampled picture set, constructing an objective function from the relation matrix with semantic consistency introduced, solving it iteratively with a stochastic gradient descent algorithm, and obtaining the optimized transformation matrix A once the objective has converged;
step 4: for each picture sample x, substituting the transformation matrix A into the relation matrix defined in step 3 to obtain the value of each element of the relation matrix; constructing a similarity matrix Z from the relation matrix, deriving a coding matrix from the similarity matrix, hash-coding each picture in the large-scale picture data set with the coding matrix, and thereby compressing the original d-dimensional picture features into k-bit binary codes;
step 5: for a new query picture q, calculating its binary code with the coding matrix and comparing it by Hamming distance with the binary code of each picture in the picture data set; if the Hamming distance is smaller than a set threshold r, the two pictures are considered approximate pictures.
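Steps 1 and 2 above can be illustrated with a minimal NumPy sketch. The array sizes, the random toy data, the helper name `sample_picture_set`, and the choice to sample without replacement are all assumptions made for illustration; the text only says that a part of the pictures is extracted at random.

```python
import numpy as np


def sample_picture_set(X, Y, m, seed=0):
    """Randomly pick m pictures (without replacement) as the sampled picture set."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[0], size=m, replace=False)
    return idx, X[idx], Y[idx]


# Toy stand-ins for the inputs of step 1 (all sizes are hypothetical).
n, d, c, m = 1000, 64, 5, 50
rng = np.random.default_rng(42)
X = rng.normal(size=(n, d))                 # n x d picture feature matrix
labels = rng.integers(0, c, size=n)
Y = np.eye(c, dtype=int)[labels]            # n x c class label matrix with 0/1 entries

idx, U, F_U = sample_picture_set(X, Y, m)   # step 2: sampled features and their labels
```

Here `U` holds the features of the sampled pictures and `F_U` their class label vectors; both are needed later, when similarities and label differences are computed against the sampled set only.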
Further, in the approximate search method of the present invention, the transformation matrix A is calculated as follows:
step (1), defining a relation matrix W between pictures, each element of which is defined as:
$$W_{ij} = \exp\left(-\lVert A(x_i - u_j)\rVert^2\right) \tag{1}$$
where $A$ is the transformation matrix, $x_i$ is the i-th picture in the picture set, and $u_j$ is the j-th picture in the sampled picture set;
step (2), defining the objective function as follows:
[Formula (2), the objective function O(A), appears only as an image in the source: Figure BDA0001302219730000021]
where $f_i$ is the class label vector of the i-th picture sample, a c-dimensional column vector whose elements are 1 or 0, indicating that the picture does or does not belong to the corresponding class; $f_j$ is the class label vector of a picture in the sampled picture set; and $\lVert f_i - f_j\rVert^2$ is the semantic-consistency property introduced when training the transformation matrix;
step (3), optimizing the transformation matrix A with a stochastic gradient descent algorithm, using the following iterative update rule:
[Formula (3), the iterative update rule, appears only as an image in the source: Figure BDA0001302219730000022]
where $\gamma_t$ is the step size in each iteration; the initial value of the transformation matrix is $I/\delta$, where $I$ is the $d \times d$ identity matrix and $\delta$ is the median of the Euclidean distances between pictures in the picture set;
step (4), after all the picture samples in the picture set have been traversed, obtaining the final optimized transformation matrix A.
Further, in the approximate search method of the present invention, the step size $\gamma_t$ is selected from one of the following values: $1\times10^{-5}$, $1\times10^{-4}$, $1\times10^{-3}$ or $1\times10^{-2}$.
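Formula (1) and the initial value I/δ can be sketched as follows. With A = I/δ, formula (1) reduces to the Gaussian kernel exp(-‖x_i - u_j‖²/δ²), which the last lines check numerically. The helper names and the toy data are hypothetical, and estimating the median distance on a random subsample is an assumption made to keep the cost bounded; the text defines δ over the picture set itself.

```python
import numpy as np


def sq_dists(P, Q):
    """Squared Euclidean distances between the rows of P and the rows of Q."""
    sq = (P ** 2).sum(1)[:, None] + (Q ** 2).sum(1)[None, :] - 2.0 * P @ Q.T
    return np.maximum(sq, 0.0)              # clip tiny negatives caused by round-off


def median_distance(X, max_points=500, seed=0):
    """Median pairwise Euclidean distance, estimated on a random subsample (assumption)."""
    rng = np.random.default_rng(seed)
    if X.shape[0] > max_points:
        X = X[rng.choice(X.shape[0], size=max_points, replace=False)]
    iu = np.triu_indices(X.shape[0], k=1)   # each unordered pair once
    return float(np.median(np.sqrt(sq_dists(X, X)[iu])))


def relation_matrix(X, U, A):
    """W[i, j] = exp(-||A (x_i - u_j)||^2), i.e. formula (1)."""
    return np.exp(-sq_dists(X @ A.T, U @ A.T))


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))                       # toy picture features
U = X[rng.choice(200, size=20, replace=False)]      # toy sampled picture set
delta = median_distance(X)
A0 = np.eye(8) / delta                              # initial value I / delta
W0 = relation_matrix(X, U, A0)                      # one row per picture, one column per sampled picture

# With A = I / delta, formula (1) is a Gaussian kernel with bandwidth delta.
gaussian = np.exp(-sq_dists(X, U) / delta ** 2)
```

Computing `A (x_i - u_j)` as the difference of the transformed rows avoids materialising all pairwise difference vectors.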
Further, in the approximate search method of the present invention, step 4 is specifically as follows:
step a, after the transformation matrix A is obtained, using formula (1) to calculate the optimized similarity, with semantic consistency introduced, between each picture and the sampled pictures, i.e. obtaining the value of each element of the relation matrix W; if the sampled picture set contains m picture samples, a similarity matrix Z is constructed from the relation matrix,
the Z matrix being defined as follows:
[Formula (4), the definition of the Z matrix, appears only as an image in the source: Figure BDA0001302219730000031]
where $\langle i \rangle$ denotes the sampled picture set, i.e. the corresponding element of the Z matrix is calculated only when the picture belongs to the sampled picture set; otherwise the corresponding element of the Z matrix is 0;
step b, letting the number of pictures in the sampled picture set be m, and constructing an m × m matrix M, defined as follows:
$$M = \Lambda^{-1/2} Z^{T} Z \Lambda^{-1/2} \tag{5}$$
where $\Lambda = \mathrm{diag}(Z^{T}\mathbf{1})$ is a diagonal matrix; from the M matrix, computing the $k \times k$ diagonal matrix formed by its k largest eigenvalues, $\Sigma = \mathrm{diag}(\delta_1, \ldots, \delta_k) \in \mathbb{R}^{k \times k}$, and the $m \times k$ matrix formed by the eigenvectors corresponding to those k largest eigenvalues, $V = [v_1, \ldots, v_k] \in \mathbb{R}^{m \times k}$;
step c, constructing the final coding matrix Y from the matrices obtained above, the Y matrix being defined as follows:
[Formula (6), the definition of the coding matrix Y, appears only as an image in the source: Figure BDA0001302219730000032]
where Y is an n × k matrix, n is the number of pictures in the picture set, and k is the number of bits of the binary code; each row of the coding matrix Y is a coding function, each picture yields a k-dimensional vector through the coding function, and the vector is then binarized with sgn(Y) to obtain the binary code of each picture in the picture set.
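Steps a to c can be sketched end to end as below. Formulas (4) and (6) are images in the source, so this sketch borrows the forms used by Anchor Graph Hashing, which the background section cites: Z keeps the s largest similarities in each row of W and normalises the row to sum to 1, and Y = √n · Z Λ^(-1/2) V Σ^(-1/2). Both forms, the parameter `s`, the function names, and the toy data are assumptions, not the patent's confirmed definitions.

```python
import numpy as np


def build_Z(W, s=3):
    """Keep the s largest entries in each row of W and normalise each row to sum to 1."""
    n = W.shape[0]
    top = np.argpartition(-W, s - 1, axis=1)[:, :s]   # column indices of the s largest per row
    rows = np.arange(n)[:, None]
    Z = np.zeros_like(W)
    Z[rows, top] = W[rows, top]
    return Z / Z.sum(axis=1, keepdims=True)


def hash_codes(Z, k):
    """Return n x k codes in {0, 1} and the m x k projection used to produce them."""
    n = Z.shape[0]
    lam = np.maximum(Z.sum(axis=0), 1e-12)            # diagonal of Lambda = diag(Z^T 1)
    inv_sqrt = 1.0 / np.sqrt(lam)
    M = inv_sqrt[:, None] * (Z.T @ Z) * inv_sqrt[None, :]   # formula (5)
    evals, evecs = np.linalg.eigh(M)                  # eigenvalues in ascending order
    top = np.argsort(evals)[::-1][:k]                 # indices of the k largest eigenvalues
    sigma = np.maximum(evals[top], 1e-12)
    V = evecs[:, top]
    # Assumed form of formula (6): Y = sqrt(n) * Z * Lambda^(-1/2) * V * Sigma^(-1/2)
    P = np.sqrt(n) * (inv_sqrt[:, None] * V) / np.sqrt(sigma)[None, :]
    return (Z @ P > 0).astype(np.uint8), P            # sgn(.) stored as 0/1 bits


rng = np.random.default_rng(0)
X = rng.normal(size=(300, 16))                        # toy picture features
U = X[rng.choice(300, size=30, replace=False)]        # toy sampled picture set
sq = ((X[:, None, :] - U[None, :, :]) ** 2).sum(axis=-1)
W = np.exp(-sq / np.median(sq))                       # stand-in for formula (1)
Z = build_Z(W, s=3)
codes, P = hash_codes(Z, k=8)
```

Because every row of Z sums to 1, the largest eigenvalue of M is 1 and its eigenvector gives a bit that is the same for every picture. Anchor Graph Hashing discards that eigenvalue; the sketch follows the wording above and keeps the k largest.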
Further, in the approximate search method of the present invention, r is selected from one of the following values: 1,2,3, or 4.
The key technology of the invention is as follows:
(1) approximate search algorithm based on semantic consistency
The approximate search algorithm based on semantic consistency introduces semantic consistency when calculating the similarity between each picture in the picture data set and the pictures in the sampled picture set, and constructs an objective function containing semantic information. The objective is then solved iteratively with a stochastic gradient descent algorithm; once it has converged, a transformation matrix reflecting the inherent semantic consistency among the pictures is obtained. Hashing then maps each picture to a k-bit binary code, so that similar input pictures are mapped to binary codes that are close in Hamming distance.
(2) Stochastic gradient descent (SGD) algorithm:
Stochastic gradient descent is an improved variant of the gradient descent (GD) algorithm. It mainly addresses the slow convergence of the original gradient descent algorithm and its tendency to fall into local optima, and it is an iterative method for minimizing a loss function or risk function. The invention uses stochastic gradient descent to reduce the training time of the transformation matrix in the semantic-consistency approximate search method.
Compared with the prior art, the invention adopting the technical scheme has the following technical effects:
1. It solves the problem that the traditional gradient descent algorithm converges too slowly.
2. The optimized similarity between pictures is calculated with the transformation matrix, which avoids the excessive dependence on sensitive parameters that arises when picture similarity is measured with a traditional Gaussian kernel function.
3. Hashing compresses the original d-dimensional picture features into k-bit binary codes, which greatly improves the efficiency of the algorithm and greatly reduces memory usage.
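The storage saving in item 3 can be made concrete with a small calculation. The collection size, the feature dimension, the code length, and the float32 feature type are hypothetical values chosen for illustration; `np.packbits` is used to store 8 code bits per byte.

```python
import numpy as np

n, d, k = 1_000_000, 512, 64                 # hypothetical collection size, feature dim, code length
raw_bytes = n * d * 4                        # float32 features: 4 bytes per dimension
code_bytes = n * k // 8                      # k bits per picture
ratio = raw_bytes / code_bytes               # how many times smaller the codes are

rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=(1000, k), dtype=np.uint8)   # stand-in binary codes
packed = np.packbits(bits, axis=1)           # 8 bits per byte, shape (1000, k // 8)
restored = np.unpackbits(packed, axis=1)     # lossless round trip back to 0/1 bits
```

Under these assumed numbers the features take about 2 GB and the codes about 8 MB.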
Drawings
FIG. 1 is a system framework diagram of the present invention.
Fig. 2 is a flow chart of the method of the present invention.
Detailed Description
The technical scheme of the invention is explained in further detail below with reference to the drawings.
The invention introduces semantic consistency when measuring picture similarity, so that the similarity between pictures can be measured more accurately. A sampled picture set is generated for calculating the optimized similarity after semantic consistency is introduced, and stochastic gradient descent reduces the training time, so that the algorithm can be applied effectively to large-scale picture data sets. Efficient binary codes are then generated with a hash coding technique, giving better performance in approximate picture search.
As shown in FIG. 1, the invention provides a method that, based on semantic consistency in a large-scale picture set, binary-codes the pictures with a hash coding technique and finds approximate pictures by comparing the Hamming distances between the picture codes.
The invention mainly comprises two parts: a process of training a transformation matrix and a process of hash coding.
The process of training the conversion matrix mainly introduces semantic consistency when calculating the similarity between the pictures in the picture set and the sampled pictures and obtains the conversion matrix required by the next stage.
The hash coding process calculates the optimized similarity between the pictures and the sampled pictures from the transformation matrix obtained in training, constructs a similarity matrix from the optimized similarity, and then binary-codes each picture in the picture set with a hash coding technique. The binary code of a new query picture is then compared by Hamming distance with the binary code of each picture to find the neighbors of the query picture.
I. Training the transformation matrix:
The training process builds a model based on the idea of semantic consistency and obtains the transformation matrix required in the coding stage; this matrix reflects the inherent semantic consistency among pictures. The invention reduces training time by using the stochastic gradient descent (SGD) algorithm during training. If the picture features are d-dimensional, the trained transformation matrix is a square matrix with d rows and d columns.
The basic idea of the approximate search method based on semantic consistency in a large-scale picture set is to compress each picture from its original d-dimensional features to a k-bit binary code by introducing semantic consistency, and to map similar input pictures to binary codes that are close in Hamming distance.
Step 1: to calculate the optimized similarity, if the picture data set contains n picture samples and the sampled picture set contains m picture samples, the relation matrix W between the pictures is defined as an n × m matrix (one row per picture, one column per sampled picture), each element of which is defined as:
$$W_{ij} = \exp\left(-\lVert A(x_i - u_j)\rVert^2\right) \tag{1}$$
where $A$ is the transformation matrix, $x_i$ is the i-th picture in the data set, and $u_j$ is the j-th picture in the sampled picture set.
Step 2: in training the transformation matrix A, semantic consistency is introduced to build an objective function, and the transformation matrix required by the coding stage is obtained by iterative solution with the stochastic gradient descent (SGD) algorithm. The objective function is defined as:
[Formula (2), the objective function O(A), appears only as an image in the source: Figure BDA0001302219730000051]
In the objective function above, $f_i$ is the class label vector of the i-th picture sample (a c-dimensional column vector, where c is the number of classes; each element is 1 or 0, indicating that the picture does or does not belong to the corresponding class), and $f_j$ is the class label vector of a picture in the sampled picture set. The term $\lVert f_i - f_j\rVert^2$ is the semantic-consistency property introduced when training the transformation matrix. Combined with the feature similarity between pictures, it yields more accurate binary codes.
Step 3: the invention uses a stochastic gradient descent algorithm in the optimization to reduce the time needed to train the transformation matrix. The initial value of the transformation matrix is $I/\delta$, where $I$ is the $d \times d$ identity matrix and $\delta$ is the median of the Euclidean distances between pictures in the data set. The transformation matrix A is then optimized by stochastic gradient descent with the following iterative update rule:
[Formula (3), the iterative update rule, appears only as an image in the source: Figure BDA0001302219730000061]
where $\gamma_t$ is the step size in each iteration, which can be selected from the following values: $1\times10^{-5}$, $1\times10^{-4}$, $1\times10^{-3}$, $1\times10^{-2}$. After all the picture samples in the picture data set have been traversed, the final optimized transformation matrix A is obtained. Training of the transformation matrix is then complete, and A is output.
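The training loop described in steps 1 to 3 can be sketched as a single pass of stochastic gradient descent starting from I/δ. Formulas (2) and (3) are images in the source, so the loss used here is a HYPOTHETICAL surrogate, O_i(A) = Σ_j W_ij ‖f_i - f_j‖², chosen only because it combines the two ingredients the text names; it should not be read as the patent's objective or update rule. The function names and toy data are likewise assumptions.

```python
import numpy as np


def sgd_train(n, d, grad_fn, gamma=1e-3, delta=1.0, seed=0):
    """One pass over the picture set: A <- A - gamma * grad_fn(A, i), starting from I / delta."""
    A = np.eye(d) / delta
    for i in np.random.default_rng(seed).permutation(n):
        A = A - gamma * grad_fn(A, i)
    return A


def make_surrogate(X, U, F, F_U):
    """HYPOTHETICAL per-picture loss O_i(A) = sum_j W_ij * ||f_i - f_j||^2 and its gradient."""
    def parts(A, i):
        D = X[i] - U                              # rows are x_i - u_j
        AD = D @ A.T                              # rows are A (x_i - u_j)
        w = np.exp(-(AD ** 2).sum(axis=1))        # W_ij, formula (1)
        s = ((F[i] - F_U) ** 2).sum(axis=1)       # ||f_i - f_j||^2
        return D, AD, w * s

    def loss(A, i):
        return float(parts(A, i)[2].sum())

    def grad(A, i):
        D, AD, ws = parts(A, i)
        return -2.0 * (AD * ws[:, None]).T @ D    # sum_j -2 W_ij s_ij (A d_ij) d_ij^T

    return loss, grad


rng = np.random.default_rng(1)
n, d, c, m = 60, 4, 3, 10                         # toy sizes
X = rng.normal(size=(n, d))
F = np.eye(c)[rng.integers(0, c, size=n)]         # 0/1 class label vectors
idx = rng.choice(n, size=m, replace=False)        # sampled picture set
loss, grad = make_surrogate(X, X[idx], F, F[idx])
A = sgd_train(n, d, grad, gamma=1e-3, delta=2.0)
```

Passing the gradient in as a function keeps the loop independent of the objective, so the surrogate can be replaced by the gradient of formula (2) once that formula is available.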
II. Hash coding process:
As shown in FIG. 2, the hash coding process first uses the transformation matrix obtained in the previous stage to construct a similarity matrix Z that reflects the optimized similarity between the samples and the sampled picture set. Each picture in the large-scale picture data set is then hash-coded. To find the approximate pictures of a new picture in the data set, the Hamming distances between the binary codes of the pictures are compared; if the Hamming distance is smaller than a set threshold r, the two pictures are considered approximate pictures.
Step 1: after the transformation matrix A is obtained, formula (1) is used to calculate the optimized similarity, with semantic consistency introduced, between each picture and the sampled pictures, i.e. the values of the elements of the relation matrix W. If the sampled picture set contains m picture samples, the similarity matrix Z required by the hash coding technique can be constructed from the relation matrix. The Z matrix is defined as follows:
[Formula (4), the definition of the Z matrix, appears only as an image in the source: Figure BDA0001302219730000062]
where $\langle i \rangle$ denotes the sampled picture set. That is, the corresponding element of the Z matrix is calculated only when the picture belongs to the sampled picture set; otherwise the corresponding element of the Z matrix is 0.
Step 2: let the number of pictures in the sampled picture set be m, and construct an m × m matrix M. The M matrix is defined as follows:
$$M = \Lambda^{-1/2} Z^{T} Z \Lambda^{-1/2} \tag{5}$$
where $\Lambda = \mathrm{diag}(Z^{T}\mathbf{1})$ is a diagonal matrix. From the M matrix, compute the $k \times k$ diagonal matrix formed by its k largest eigenvalues, $\Sigma = \mathrm{diag}(\delta_1, \ldots, \delta_k) \in \mathbb{R}^{k \times k}$, and the $m \times k$ matrix formed by the eigenvectors corresponding to those k largest eigenvalues, $V = [v_1, \ldots, v_k] \in \mathbb{R}^{m \times k}$.
Step 3: construct the final coding matrix Y from the matrices obtained above. The Y matrix is defined as follows:
[Formula (6), the definition of the coding matrix Y, appears only as an image in the source: Figure BDA0001302219730000063]
Y is an n × k matrix, where n is the number of pictures in the picture set and k is the number of bits of the binary code. Each row of the coding matrix Y is a coding function; each picture yields a k-dimensional vector through the coding function, and the vector is then binarized with sgn(Y). This gives the binary code of each picture in the picture data set.
Step 4: to find the approximate pictures of a new query picture, the binary code of the query picture is calculated with the coding function, and its Hamming distance to the code of every picture in the picture data set is computed. A Hamming distance threshold r is defined (r may be 1, 2, 3 or 4); if the Hamming distance between the query picture and a picture is less than r, that picture is considered an approximate picture of the query picture. Traversing the picture data set finds all approximate pictures of the query picture.
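The threshold test in the final step is a linear scan over the stored codes. A minimal sketch, using a hypothetical helper name and four hand-written 8-bit codes, is:

```python
import numpy as np


def hamming_neighbors(query_code, db_codes, r):
    """Return the indices of codes whose Hamming distance to the query is smaller than r."""
    dist = np.count_nonzero(db_codes != query_code, axis=1)   # differing bits per row
    return np.flatnonzero(dist < r), dist


# Four hand-written 8-bit codes (hypothetical data).
db = np.array([[0, 1, 1, 0, 1, 0, 0, 1],
               [0, 1, 1, 0, 1, 0, 0, 0],
               [1, 0, 0, 1, 0, 1, 1, 0],
               [0, 1, 0, 0, 1, 0, 1, 1]], dtype=np.uint8)
q = np.array([0, 1, 1, 0, 1, 0, 0, 1], dtype=np.uint8)

hits, dist = hamming_neighbors(q, db, r=2)
print(dist.tolist())   # [0, 1, 8, 2]
print(hits.tolist())   # [0, 1]
```

The comparison is strict, matching "smaller than the threshold r" in the text, so the fourth code at distance 2 is not returned when r = 2.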
The overall flow of the method of the invention is as follows:
Step 1: input the sample matrix X of the picture data set (X is an n × d matrix, where n is the number of pictures, which may be very large, and d is the dimension of the picture features), and input the semantic class label matrix Y corresponding to the picture data set (Y is an n × c matrix, where n is the number of samples and c is the number of class labels).
Step 2: randomly extract a subset of pictures from the picture set as the sampled picture set. Calculating similarities only between the pictures and the sampled pictures greatly reduces the computation time and improves the efficiency of the algorithm.
Step 3: for each picture in the picture data set, introduce semantic consistency to construct the objective function O(A), where A (a d × d matrix, d being the dimension of the picture features) is the transformation matrix required in the coding stage. Solve iteratively with a stochastic gradient descent algorithm; the optimized transformation matrix A is obtained once the objective has converged.
Step 4: for each picture sample x, calculate the similarity between x and the sampled pictures by applying the transformation matrix A as in formula (1). This gives the optimized similarity after semantic consistency is introduced. The pictures are then coded with a hash technique, which compresses the original d-dimensional picture features into k-bit binary codes.
Step 5: to find the approximate pictures of a new query picture q, first calculate the similarity between q and the sampled pictures by applying the transformation matrix A trained in step 3, which gives the optimized similarity after semantic consistency is introduced. Then calculate the binary code of the query picture with the coding function and compare it by Hamming distance with the binary code of each picture in the picture data set. If the Hamming distance is smaller than a set threshold r, the two pictures are considered approximate pictures.
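Coding a new query picture in the final step can be sketched as follows. The sketch assumes that the coding function has the form sgn(z(q) P), where z(q) is the query's normalised similarity row and P is an m × k projection learned in the coding stage; this form is borrowed from Anchor Graph Hashing and is not confirmed by the source, whose formula (6) is an image. The function name, the parameter `s`, and the random stand-ins for A and P are hypothetical.

```python
import numpy as np


def encode_query(q, U, A, P, s=3):
    """Binary code of one query picture from its similarities to the sampled pictures."""
    diff = (q - U) @ A.T                          # rows are A (q - u_j)
    w = np.exp(-(diff ** 2).sum(axis=1))          # formula (1) evaluated for the query
    keep = np.argsort(w)[-s:]                     # s most similar sampled pictures (assumption)
    z = np.zeros_like(w)
    z[keep] = w[keep] / w[keep].sum()             # normalised similarity row
    return (z @ P > 0).astype(np.uint8), z


rng = np.random.default_rng(0)
d, m, k = 8, 20, 16                               # toy sizes
U = rng.normal(size=(m, d))                       # sampled picture set
A = np.eye(d) / 4.0                               # stand-in for the trained transformation matrix
P = rng.normal(size=(m, k))                       # stand-in for the learned m x k projection
q = U[0] + 0.01 * rng.normal(size=d)              # a query lying close to the first sampled picture
code, z = encode_query(q, U, A, P)
```

Only the m sampled pictures are touched when coding a query, so the cost does not depend on the size n of the picture set.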
With the technical scheme above, the invention solves the following problems of the prior art:
(1) Poor performance because the training process of traditional approximate search algorithms does not introduce semantic consistency: many conventional picture neighbor search algorithms are too one-sided when searching for the neighbors of a query picture and do not consider the semantic information the pictures may carry, so they perform poorly in practical approximate picture search. The invention introduces semantic consistency when measuring picture similarity, so the similarity between pictures can be measured more accurately, and the method can be applied effectively to real-world approximate picture search.
(2) Slow similarity computation on large-scale picture data sets, solved by calculating the optimized similarity against a sampled picture set: in a large-scale picture data set, computing the similarity between all pairs of pictures with a traditional measure is so expensive that it is infeasible in practice. The method randomly extracts a small number of pictures from the massive picture set as the sampled picture set and calculates only the optimized similarity between the pictures and the sampled picture set. This greatly reduces the time overhead of the algorithm and improves its efficiency.
(3) Slow convergence of the objective function, solved by using a stochastic gradient descent algorithm: the original gradient descent algorithm, also called batch gradient descent, minimizes the loss function over all training data, so the final solution is a global optimum, i.e. the solved parameters are those that minimize the loss. However, every iteration of batch gradient descent requires all the data in the training set, so it is very slow when the data set contains many pictures. Stochastic gradient descent uses only one data sample per update and is therefore fast; the speed advantage is especially pronounced on large-scale picture data sets. Moreover, stochastic gradient descent can reach convergence on the target loss function without traversing the entire data set. The invention replaces batch gradient descent with stochastic gradient descent to solve the objective function iteratively, which solves the problem of slow convergence.
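The cost difference between batch and stochastic gradient descent in item (3) can be illustrated on a toy problem. The sketch below uses noiseless least-squares regression, which is unrelated to the patent's objective and was chosen only because its solution is known; the data, step sizes, and iteration counts are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 3
X = rng.normal(size=(n, d))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true                               # noiseless targets

# Batch gradient descent: every update uses all n samples.
w_batch = np.zeros(d)
for _ in range(200):
    w_batch -= 0.1 * X.T @ (X @ w_batch - y) / n

# Stochastic gradient descent: every update uses a single sample.
w_sgd = np.zeros(d)
for i in rng.permutation(n):                 # one pass over the data
    w_sgd -= 0.05 * (X[i] @ w_sgd - y[i]) * X[i]

batch_sample_visits = 200 * n                # samples processed by the batch loop
sgd_sample_visits = n                        # samples processed by the stochastic loop
```

Both loops recover `w_true` closely on this toy problem, but the batch loop processes 200 times as many samples. How the two methods compare on the patent's own objective is not something this example can show.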
In summary, the invention uses a transformation matrix reflecting the inherent semantic consistency among pictures to calculate the optimized similarity between pictures. To improve search efficiency, a subset of pictures is randomly selected from the large-scale picture set as the sampled picture set for measuring similarity, and stochastic gradient descent is used to reduce training time when the transformation matrix is trained. After the similarity matrix used for coding is obtained from the optimized similarity between pictures, each original picture is mapped to a k-bit binary code with a hash coding technique. To find the neighbors of a new query picture, the binary code of the query picture is first obtained through the coding function of the model, and its Hamming distance to the code of every picture in the picture set is then compared. A picture is considered an approximate picture of the query picture when the Hamming distance between their codes is less than the given Hamming distance threshold.
The foregoing describes only some embodiments of the present invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the invention, and such modifications and refinements also fall within the protection scope of the invention.

Claims (4)

1. An approximate search method based on semantic consistency in a large-scale picture set, characterized by comprising the following steps:
step 1: inputting a picture set sample matrix X and a semantic class label matrix Y corresponding to the picture set, where X is an n × d matrix, Y is an n × c matrix, n is the number of pictures in the picture set, d is the dimension of the picture features, and c is the number of class labels;
step 2: randomly extracting a subset of pictures from the picture set as the sampled picture set;
step 3: defining a relation matrix W between the pictures in the picture set and the pictures in the sampled picture set, constructing an objective function from the relation matrix with semantic consistency introduced, solving it iteratively with a stochastic gradient descent algorithm, and obtaining the optimized transformation matrix A once the objective has converged; the transformation matrix A being calculated as follows:
step (1), defining a relation matrix W between pictures, each element of which is defined as:
$$W_{ij} = \exp\left(-\lVert A(x_i - u_j)\rVert^2\right) \tag{1}$$
where $A$ is the transformation matrix, $x_i$ is the i-th picture in the picture set, and $u_j$ is the j-th picture in the sampled picture set;
step (2), defining the objective function as follows:
[Formula (2), the objective function, appears only as an image in the source: Figure FDA0002770353310000011]
where $f_i$ is the class label vector of the i-th picture sample, a c-dimensional column vector whose elements are 1 or 0, indicating that the picture does or does not belong to the corresponding class, and $f_j$ is the class label vector of the j-th picture in the sampled picture set;
step (3), optimizing the transformation matrix A according to a random gradient descent algorithm, wherein an iteration updating rule is as follows:
Figure FDA0002770353310000012
wherein $\gamma_t$ is the optimization step size in each iteration, the initial value of the conversion matrix is I/δ, I is the d × d identity matrix, and δ is the median of the Euclidean distances between pictures in the picture set;
step (4), after all the picture samples in the picture set have been traversed, obtaining the finally optimized conversion matrix A;
step 4: for each picture sample x, substituting the conversion matrix A into the relation matrix defined in step 3 to obtain the value of each element of the relation matrix; constructing a similarity matrix Z from the relation matrix, obtaining a coding matrix from the similarity matrix, hash-coding each picture in the large-scale picture data set with the coding matrix, and thereby compressing and mapping the original d-dimensional features of the pictures into k-dimensional binary codes;
step 5: for a new query picture q, calculating the binary code of the query picture with the coding matrix and comparing its Hamming distance to the binary code of each picture in the picture data set; if the Hamming distance is smaller than a set threshold r, the two pictures are considered approximate pictures.
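As an illustration of steps 2 and 3(1) of claim 1, the following is a minimal NumPy sketch that draws a sampled picture set, builds the initial conversion matrix A = I/δ, and evaluates formula (1). The function names `initial_transform` and `relation_matrix`, the random feature data, and the sample sizes are hypothetical and are not taken from the patent; the stochastic gradient descent update of A is not shown because formulas (2) and (3) are not reproduced in the text.

```python
import numpy as np

def initial_transform(X):
    """Initial conversion matrix A = I / delta, where delta is the median
    Euclidean distance between distinct pictures in X (n x d)."""
    n, d = X.shape
    diffs = X[:, None, :] - X[None, :, :]          # (n, n, d); fine for small n only
    dists = np.sqrt((diffs ** 2).sum(axis=-1))     # pairwise Euclidean distances
    upper = np.triu_indices(n, k=1)                # each unordered pair once
    delta = np.median(dists[upper])
    return np.eye(d) / delta

def relation_matrix(X, U, A):
    """W[i, j] = exp(-||A (x_i - u_j)||^2), as in formula (1)."""
    XA = X @ A.T                                   # row i is (A x_i)^T
    UA = U @ A.T                                   # row j is (A u_j)^T
    sq = ((XA[:, None, :] - UA[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq)

# Hypothetical data: 50 pictures with 8-dimensional features, 10 sampled pictures.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))
sample_idx = rng.choice(50, size=10, replace=False)  # step 2: random sampling
U = X[sample_idx]
A = initial_transform(X)
W = relation_matrix(X, U, A)                          # shape (n, m) = (50, 10)
```

Each entry of W lies in (0, 1], and a picture that is itself in the sampled set has similarity 1 to its own copy. For a genuinely large picture set, the pairwise distance computation used for δ would need to be estimated on a subsample rather than on all n² pairs.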
2. The approximate search method according to claim 1, wherein the step size $\gamma_t$ is selected from one of the following values: $1 \times 10^{-5}$, $1 \times 10^{-4}$, $1 \times 10^{-3}$, or $1 \times 10^{-2}$.
3. The approximate search method according to claim 1, wherein step 4 is as follows:
step a, after the conversion matrix A is obtained, calculating with formula (1) the optimized similarity, with semantic consistency introduced, between each picture and each sampled picture, that is, obtaining the value of each element of the relation matrix W; given that the sampled picture set contains m picture samples, constructing the similarity matrix Z from the relation matrix,
wherein the Z matrix is defined as follows:
Figure FDA0002770353310000021
wherein ⟨i⟩ denotes the sampled picture set, that is, the value of an element of the Z matrix is calculated only when the corresponding picture belongs to the sampled picture set, and is 0 otherwise;
step b, with m being the number of pictures in the sampled picture set, constructing an m × m matrix M, defined as follows:
$M = \Lambda^{-1/2} Z^{T} Z \Lambda^{-1/2}$ (5)
wherein $\Lambda = \mathrm{diag}(Z^{T}\mathbf{1})$ is a diagonal matrix; computing the l × l diagonal matrix formed by the l largest eigenvalues of M, $\Sigma = \mathrm{diag}(\delta_1, \ldots, \delta_l) \in \mathbb{R}^{l \times l}$, and the m × l matrix formed by the eigenvectors corresponding to those l largest eigenvalues, $V = [v_1, \ldots, v_l] \in \mathbb{R}^{m \times l}$;
step c, constructing the final coding matrix $Y_t$ from the matrices obtained by the above formulas, wherein $Y_t$ is defined as follows:
Figure FDA0002770353310000022
$Y_t$ is an n × k matrix, where n is the number of pictures in the picture set and k is the number of code bits when mapping to binary codes; each row of the coding matrix $Y_t$ serves as a coding function, each picture is passed through the coding function to obtain a k-dimensional vector, and the vector is then binarized by $\mathrm{sgn}(Y_t)$ to obtain the binary code of each picture in the picture set.
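Steps a to c can be sketched in NumPy as follows. Because the formulas for Z and for the coding matrix appear only as images in this text, the sketch makes two loudly stated assumptions: Z is taken to be W with each row normalized to sum to 1, and the real-valued codes are formed as $\sqrt{n}\, Z \Lambda^{-1/2} V \Sigma^{-1/2}$, the form used in anchor graph hashing, with l = k. Neither is confirmed by the claim text, and the function name `anchor_codes` and the random input are hypothetical.

```python
import numpy as np

def anchor_codes(W, k):
    """Binary codes from a relation matrix W (n x m), using k bits."""
    n, m = W.shape
    # ASSUMPTION: Z is W normalised so that every row sums to 1.
    Z = W / W.sum(axis=1, keepdims=True)
    lam = Z.sum(axis=0)                          # diagonal of Lambda = diag(Z^T 1)
    lam_inv_sqrt = 1.0 / np.sqrt(lam)
    # M = Lambda^{-1/2} Z^T Z Lambda^{-1/2}, an m x m symmetric matrix.
    M = lam_inv_sqrt[:, None] * (Z.T @ Z) * lam_inv_sqrt[None, :]
    evals, evecs = np.linalg.eigh(M)             # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:k]          # indices of the k largest
    sigma = np.maximum(evals[order], 1e-12)      # guard against division by ~0
    V = evecs[:, order]                          # m x k eigenvector matrix
    # ASSUMPTION: anchor-graph-hashing style projection matrix (m x k).
    P = np.sqrt(n) * (lam_inv_sqrt[:, None] * V) / np.sqrt(sigma)[None, :]
    Y = Z @ P                                    # real-valued codes, n x k
    B = np.where(Y >= 0, 1, -1).astype(np.int8)  # sgn, mapping 0 to +1
    return B, P

# Hypothetical relation matrix: 50 pictures against 10 sampled pictures.
rng = np.random.default_rng(1)
W = rng.uniform(0.1, 1.0, size=(50, 10))
B, P = anchor_codes(W, k=4)
```

Under these assumptions, a new query picture would be coded by computing its row of Z against the sampled pictures and multiplying by the same projection `P` before taking the sign.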
4. The approximate search method according to claim 1, wherein the threshold r is selected to be one of the following values: 1, 2, 3, or 4.
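The Hamming-distance comparison of step 5, with a threshold r as in claim 4, can be illustrated with a short NumPy sketch. The function name `hamming_search` and the 4-bit example codes are hypothetical; codes are stored as ±1 values, matching a sign-based binarization.

```python
import numpy as np

def hamming_search(query_code, db_codes, r):
    """Return the indices of database codes whose Hamming distance to the
    query code is smaller than r, together with all distances."""
    dist = np.count_nonzero(db_codes != query_code, axis=1)  # differing bits per row
    return np.flatnonzero(dist < r), dist

# Hypothetical 4-bit codes for four database pictures and one query picture.
db_codes = np.array([[ 1, -1,  1,  1],
                     [ 1, -1,  1, -1],
                     [-1,  1, -1, -1],
                     [ 1, -1,  1,  1]], dtype=np.int8)
q = np.array([1, -1, 1, 1], dtype=np.int8)

hits, dist = hamming_search(q, db_codes, r=2)
# dist is [0, 1, 4, 0]; hits is [0, 1, 3]
```

With r = 2, pictures 0, 1 and 3 are reported as approximate pictures of the query, while picture 2 differs in all four bits and is rejected.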
CN201710368677.4A 2017-05-23 2017-05-23 Approximate searching method based on semantic consistency in large-scale picture set Expired - Fee Related CN107133348B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710368677.4A CN107133348B (en) 2017-05-23 2017-05-23 Approximate searching method based on semantic consistency in large-scale picture set

Publications (2)

Publication Number Publication Date
CN107133348A (en) 2017-09-05
CN107133348B (en) 2021-04-30

Family

ID=59732328

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710368677.4A Expired - Fee Related CN107133348B (en) 2017-05-23 2017-05-23 Approximate searching method based on semantic consistency in large-scale picture set

Country Status (1)

Country Link
CN (1) CN107133348B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108509651B (en) * 2018-04-17 2019-03-12 胡海峰 The distributed approximation searching method with secret protection based on semantic consistency
CN115098721B (en) * 2022-08-23 2022-11-01 浙江大华技术股份有限公司 Face feature retrieval method and device and electronic equipment
CN116842030B (en) * 2023-09-01 2023-11-17 广州尚航信息科技股份有限公司 Data synchronous updating method and system of server

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103324650A (en) * 2012-10-23 2013-09-25 深圳市宜搜科技发展有限公司 Image retrieval method and system
CN103530812A (en) * 2013-07-25 2014-01-22 国家电网公司 Power grid state similarity quantitative analyzing method based on locality sensitive hashing
CN104462196A (en) * 2014-10-30 2015-03-25 南京信息工程大学 Multi-feature-combined Hash information retrieval method
US20150161178A1 (en) * 2009-12-07 2015-06-11 Google Inc. Distributed Image Search
US20160191950A1 (en) * 2006-10-13 2016-06-30 Thomson Licensing Reference Picture List Management Syntax for Multiple View Video Coding
CN105825237A (en) * 2016-03-21 2016-08-03 山东联科云计算科技有限公司 Subgraph similarity query method based on graph measurement

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
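Research on Anchor Graph Hashing Algorithms in Large-Scale Machine Learning; Gao Shan (高珊); China Master's Theses Full-text Database, Information Science and Technology; 2015-09-15 (No. 09); pp. 1-52 *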

Also Published As

Publication number Publication date
CN107133348A (en) 2017-09-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20210430