CN110134803B - Image data quick retrieval method based on Hash learning - Google Patents

Image data quick retrieval method based on Hash learning

Info

Publication number
CN110134803B
Authority
CN
China
Prior art keywords
hash
query
image
hash code
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201910415146.5A
Other languages
Chinese (zh)
Other versions
CN110134803A (en
Inventor
王红滨
纪斯佳
张毅
周连科
王念滨
童鹏鹏
崔琎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Engineering University
Original Assignee
Harbin Engineering University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Engineering University
Priority to CN201910415146.5A
Publication of CN110134803A
Application granted
Publication of CN110134803B
Current legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50: Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51: Indexing; Data structures therefor; Storage structures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50: Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Library & Information Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A fast image data retrieval method based on hash learning, belonging to the technical field of data retrieval. The method addresses the problem that existing models apply relaxation multiple times in the hash code generation stage, which biases the negative feedback (back-propagation) process of the model during training. The deep hash model comprises five convolution-pooling layers, two fully connected layers, a feature layer, a hash layer and an output layer. The model is trained under triplet constraints, and the trained deep hash model is then used to build a sample library consisting of image samples and their corresponding hash codes. For a query image, the trained deep hash model generates the hash code of the query image, and retrieval is performed using this hash code and the image sample library. The invention is suitable for image data retrieval.

Description

Image data quick retrieval method based on Hash learning
Technical Field
The invention relates to a method for quickly retrieving image data, and belongs to the technical field of data retrieval.
Background
In recent years, with the rapid development of the internet, the volume of high-dimensional data has grown exponentially, and how to make use of this data has become a focus across industries. Researchers have proposed many methods for large-scale data retrieval, and hashing has been widely adopted for its efficient storage and computation (Li Wujun, Zhou Zhihua. Big data hash learning: status and trends [J]. Chinese Science Bulletin, 2015, 60(Z1): 485-490.). Traditional hashing methods, including locality-sensitive hashing and spectral hashing, have achieved some success in image retrieval but still fall short of practical application. The rapid development of deep learning has advanced hashing methods: in 2014 the Convolutional Neural Network Hashing model (CNNH) (R. Xia, Y. Pan, H. Lai, et al. Supervised hashing for image retrieval via image representation learning [C]. AAAI Conference on Artificial Intelligence, 2014) combined hashing with a convolutional neural network for the first time and achieved better results than traditional hashing methods. CNNH trains hash codes in two stages. The first stage decomposes a similarity matrix S, each element of which indicates whether the sample images indexed by its row and column are similar, to obtain a matrix H whose rows are the approximate hash codes of the training data. The second stage trains a convolutional network to fit these approximate codes. Because the image features the model learns during training cannot feed back into hash code generation, the Hamming distances between hash codes cannot be adjusted dynamically, and the learned hash function is suboptimal because the advantages of the convolutional neural network are not fully exploited.
On this basis, Li W J, Wang S, Kang W C et al. proposed the Deep Neural Network Hashing model (DNNH), and H. Liu, R. Wang, S. Shan et al. proposed the Deep Supervised Hashing model (DSH) in "Deep Learning based Deep Supervised Hash with Pairwise Labels". Both are end-to-end models that essentially overcome CNNH's separation of feature extraction from hash coding, and each designs its loss function for hash code generation from a different angle. However, both models apply relaxation multiple times in the hash code generation stage, which biases the negative feedback process of the models during training, so the trained models are not accurate enough for image data retrieval.
Disclosure of Invention
The invention provides a fast image data retrieval method based on hash learning, which aims to solve the problem that the negative feedback process of a model in a training stage is deviated due to the fact that the existing model uses multiple times of relaxation in a hash code generation stage.
The invention discloses a quick retrieval method of image data based on Hash learning, which comprises the following steps:
step 1, establishing a deep hash model:
the deep hash model comprises five convolution-pooling layers, two fully connected layers, a feature layer, a hash layer and an output layer;
step 2, training the deep hash model:
the training data is a labeled data set $\{(p_1,w_1),(p_2,w_2),(p_3,w_3),\ldots,(p_n,w_n)\}$, where $p_i$ is a sample image and $w_i$ is the label corresponding to that image sample;
the input is a triplet $\{p_i,p_j,p_k\}$, where $p_i$ and $p_j$ are of the same class and $p_i$ and $p_k$ are of different classes, and the distance between samples of the same class is smaller than the distance between samples of different classes;
after a trained deep hash model is obtained, establishing a sample library by using the deep hash model, wherein the sample library consists of image samples and corresponding hash codes;
step 3, for the query image, generating a hash code of the query image by using the trained deep hash model;
step 4, retrieving by using the hash code of the query image and the image sample library.
Further, the process of retrieving by using the hash code of the query image and the image sample library includes the following steps:
let the hash code corresponding to an image in the sample library be $p_i=\{h_{i,1},h_{i,2},h_{i,3},\ldots,h_{i,m}\}$ and the hash code corresponding to the query image be $p_{query}=\{h_{query,1},h_{query,2},h_{query,3},\ldots,h_{query,m}\}$; in Hamming space, the neighbors of the query image are expressed as $NN(p_{query},\varepsilon)=\{p_i \mid \|p_{query}-p_i\|_2<\varepsilon\}$;
the neighbor set P of the query sample is obtained through $\|p_{query}-p_i\|_2<\varepsilon$;
for each bit position, the value ('0' or '1') that accounts for the larger proportion among all hash codes in the neighbor set P is recorded as $S=\{S_1,S_2,S_3,\ldots,S_m\}$, $S_i\in\{0,1\}$;
the probability of '0' at each bit over all hash codes in the neighbor set P is counted as $P(S_0)=\{P(S_{1,0}),P(S_{2,0}),P(S_{3,0}),\ldots,P(S_{m,0})\}$, where $P(S_{i,0})\in[0,1]$;
$P(S_{i,0})=\sum(S_i=0,\,NN(p_{query},\varepsilon))/count(NN(p_{query},\varepsilon))$, where, for each bit position, the sum counts the hash codes in the neighbor set whose bit at that position is '0' and count is the number of hash codes in the neighbor set;
the probability of '1' at each bit over all hash codes in the neighbor set P is counted as $P(S_1)=\{P(S_{1,1}),P(S_{2,1}),P(S_{3,1}),\ldots,P(S_{m,1})\}$, where $P(S_{i,1})\in[0,1]$;
$P(S_{i,1})=\sum(S_i=1,\,NN(p_{query},\varepsilon))/count(NN(p_{query},\varepsilon))$, where, for each bit position, the sum counts the hash codes in the neighbor set whose bit at that position is '1';
the weight of each bit of the hash code, $\omega=\{\omega_1,\omega_2,\omega_3,\ldots,\omega_m\}$, is determined through $\omega_i=1+\max(P(S_{i,0}),P(S_{i,1}))/m$;
then, using the weight of each bit of the hash code, the weighted Hamming distance between the hash code $p_i$ of an image in the sample library and the hash code $p_{query}$ of the query image is calculated:
Figure BDA0002064098060000021
The images returned from the database for the query image, and their order, are determined by the weighted Hamming distance.
Further, the distance threshold in the neighbor definition is 2.
Further, the length of the feature layer is 4 times the length of the hash code.
Further, the hash layer uses a pairwise constraint as its constraint condition. The input of this stage is the feature vectors from the feature layer, i.e. the pairwise-constraint input $\{p_i,p_j,w_{ij}\}$; $w_{ij}=1$ indicates that the samples represented by the two feature vectors belong to the same class, and $w_{ij}=0$ indicates that they do not. The feature vectors generated by the feature layer are $F_i,F_j\in R^d$, and their outputs mapped into the hash space are $b_i,b_j\in\{-1,1\}^m$; $dist_H(b_i,b_j)$ is the Hamming distance between $b_i$ and $b_j$. The loss function is as follows:
Figure BDA0002064098060000031
after passing through the hash layer, the feature vector of a sample image passes through a tanh function before the hash code is generated. The final hash codes $b_i$ and $b_j$ are obtained from the relaxed variables, and the values before relaxation are $u_i,u_j\in R^m$. In calculating the loss function, the pre-relaxation variables $u_i$ and $u_j$ are used in place of the hash codes $b_i$ and $b_j$, and the loss function is:
Figure BDA0002064098060000032
in the formula, m is the length of the hash code, and α is the weight of the regularization term.
Further, in the process of training the deep hash model, small batches of data are used for online training, and the triplets within each small batch are created according to the following rules: (1) count the samples of each label in the small batch, and take the count of the label with the fewest samples as the number of samples selected per label; (2) randomly shuffle the samples of a given label, and select samples i and i+1 as the anchor $p_i$ and the positive example $p_j$ of a triplet; (3) randomly select a sample of another label as the negative example $p_k$ of the triplet anchored at $p_i$; (4) loop over all labels and all samples, generating random combinations that each contain an anchor, a positive example and a negative example.
The most prominent characteristics and remarkable beneficial effects of the invention are as follows:
Through comparison experiments on the deep hash network structure and the rearrangement algorithm, the invention verifies that the proposed deep hash network structure performs better and that the rearrangement algorithm gives better visual results for image retrieval based on hash codes. Traditional research on hash functions generally compares similarity by Hamming distance, which does not discriminate sufficiently when the data scale is large. Experiments show that the proposed method based on hash learning performs better than other methods on CIFAR-10 and NUS-WIDE.
Drawings
FIG. 1 is a deep hash network architecture;
FIG. 2 is a diagram illustrating different classifications of hash codes in a neighborhood set;
FIG. 3 is a comparison graph of CIFAR-10 results;
FIG. 4 is a NUS-WIDE comparison graph;
FIG. 5 is a visual experimental result;
FIG. 6 is a visualization experiment result of the rearrangement algorithm;
FIG. 7 shows the accuracy for 24-bit hash codes.
Detailed Description
The first embodiment is as follows:
the image data fast retrieval method based on the hash learning of the embodiment specifically comprises the following steps:
1. establishing a deep hash model:
the deep hash model comprises five convolution-pooling layers, two fully connected layers, a feature layer, a hash layer and an output layer. The feature layer outputs a feature vector of a given length, and the hash layer then maps this feature vector to a hash code. The model structure is shown in fig. 1, and the specific parameters are shown in table 1.
TABLE 1 model parameters
Figure BDA0002064098060000041
Each fully connected layer structurally consists of a single layer of 500 x 1 neurons and an activation function. The fully connected layers connect the individual features feeding the intermediate feature layer, and the hash layer maps the relationships among the extracted features onto different bits of the hash code, so that different sample images produce hash codes separated by a large Hamming distance.
2. Training a deep hash model:
the training data is a labeled data set $\{(p_1,w_1),(p_2,w_2),(p_3,w_3),\ldots,(p_n,w_n)\}$, where $p_i$ is a sample image and $w_i$ is the label corresponding to that image sample;
a triplet $\{p_i,p_j,p_k\}$ expresses a ternary constraint on the proximity between samples: under a given metric, the distance between $p_i$ and $p_j$ is smaller than the distance between $p_i$ and $p_k$. In the actual training of the invention, the ternary constraint classifies better and adapts better to the model. The input is a triplet $\{p_i,p_j,p_k\}$, where $p_i$ and $p_j$ are of the same class and $p_i$ and $p_k$ are of different classes, and the distance between samples of the same class is smaller than the distance between samples of different classes;
in general, triplets could be formed from all of the training data, but this makes training very inefficient, and erroneous samples among the training samples can mislead the model. To ensure that the model converges quickly to the ternary constraint, the invention trains online on many small batches of data; for example, 40 sample images are selected each time and triplets are built from them. The creation of triplets within a small batch follows these rules: (1) count the samples of each label in the small batch, and take the count of the label with the fewest samples as the number of samples selected per label; (2) randomly shuffle the samples of a given label, and select samples i and i+1 as the anchor $p_i$ and the positive example $p_j$ of a triplet; (3) randomly select a sample of another label as the negative example $p_k$ of the triplet anchored at $p_i$; (4) loop over all labels and all samples, generating random combinations that each contain an anchor, a positive example and a negative example. These rules keep the sample data evenly distributed while increasing randomness.
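The four rules above can be sketched in Python as follows. This is only an illustrative reading of the rules, not the implementation used in the invention: the function name `make_triplets`, the representation of a batch as `(sample, label)` pairs, and the interpretation of rule (1) as truncating every label to the size of the rarest label are all assumptions.

```python
import random
from collections import defaultdict


def make_triplets(batch, rng=None):
    """Build (anchor, positive, negative) triplets from one small batch.

    batch: list of (sample, label) pairs. Hypothetical helper, not from the source.
    """
    rng = rng or random.Random()
    by_label = defaultdict(list)
    for sample, label in batch:
        by_label[label].append(sample)
    if len(by_label) < 2:
        return []  # no negative examples are available

    # Rule (1): the rarest label fixes how many samples are used per label
    # (assumed reading of the rule).
    n_min = min(len(samples) for samples in by_label.values())

    triplets = []
    for label, samples in by_label.items():
        # Rule (2): shuffle the samples of this label, then pair sample i with i+1.
        shuffled = samples[:]
        rng.shuffle(shuffled)
        shuffled = shuffled[:n_min]
        # Rule (3): negatives are drawn at random from every other label.
        others = [s for other, group in by_label.items() if other != label for s in group]
        # Rule (4): loop over all labels and all selected samples.
        for i in range(len(shuffled) - 1):
            anchor, positive = shuffled[i], shuffled[i + 1]
            negative = rng.choice(others)
            triplets.append((anchor, positive, negative))
    return triplets
```

With a balanced batch of 40 images over 4 labels, this sketch yields 9 triplets per label, 36 in total.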
Feature layer: the convergence condition of the deep hash network is that the feature vectors output at the feature layer for the training data satisfy the triplet constraint, which lets the model extract more expressive features. Applied to feature extraction, the triplet constraint requires that the Euclidean distance between the feature vectors of same-class samples be smaller than that between samples of different classes. The formula is as follows:
Figure BDA0002064098060000051
in the formula, the symbol shown in Figure BDA0002064098060000052 denotes the anchor sample, the symbol shown in Figure BDA0002064098060000053 denotes the positive example sample, and the symbol shown in Figure BDA0002064098060000054 denotes the negative example sample; f is the mapping function obtained by learning (mapping sample images to feature vectors); threshold is a specified value used to control the distance between positive and negative samples; and $\|\cdot\|$ denotes the Euclidean distance between feature vectors. When the intra-class distance is smaller than the inter-class distance, the error is 0; when it is not, an error is incurred. This is indicated in the formula by the plus sign.
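A minimal Python sketch of a triplet loss of this kind is given below. The exact formula is contained in the equation image and is not reproduced in the text, so the form used here (a hinge on the difference of Euclidean distances plus `threshold`) is an assumption based on the description above; in particular, whether the original squares the distances cannot be recovered from the text. The names `euclidean` and `triplet_loss` are illustrative.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, threshold):
    """Hinge-style triplet loss on feature vectors (assumed form).

    The loss is 0 when the anchor-positive distance is smaller than the
    anchor-negative distance by at least `threshold`; otherwise the
    shortfall is returned. The max(0, .) plays the role of the plus sign.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + threshold)
```

For example, with anchor (0, 0), positive (0, 1) and threshold 1, a negative at (3, 4) gives zero loss, while a negative at (0, 1.5) gives a loss of 0.5.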
The lower the value of threshold in the training phase, the more easily the loss function $Loss_{triplet}$ goes to 0; the anchor and the positive example are then not pulled very close together and the anchor and the negative example are not pushed very far apart, but the resulting model is more difficult to converge. When threshold is large, the model pulls the anchor and the positive example close together and pushes the anchor and the negative example far apart, so that $Loss_{triplet}$ stays at a large value. A reasonable threshold is therefore particularly critical for training the model. The deep hash network uses the ternary loss function as the constraint at the feature layer: by minimizing $Loss_{triplet}$, the network is updated through negative feedback and its parameters are adjusted to obtain more expressive features.
Hash layer: the hash layer uses a pairwise constraint as its constraint condition. The input of this stage is the feature vectors from the feature layer, i.e. the pairwise-constraint input $\{p_i,p_j,w_{ij}\}$; $w_{ij}=1$ indicates that the samples represented by the two feature vectors belong to the same class, and $w_{ij}=0$ indicates that they do not. The feature vectors generated by the feature layer are $F_i,F_j\in R^d$, and their outputs mapped into the hash space are $b_i,b_j\in\{-1,1\}^m$; $dist_H(b_i,b_j)$ is the Hamming distance between $b_i$ and $b_j$. The loss function is as follows:
Figure BDA0002064098060000061
wherein m is the length of the hash code;
dividing the loss function by m keeps it between 0 and 1 regardless of the hash code length. Without this division, the loss would grow with the length of the hash code; normalizing by m therefore makes the result more accurate.
When $w_{ij}=1$, gradient descent on $Loss_{pair}$ reduces the Hamming distance between $b_i$ and $b_j$ as far as possible in order to lower the value of $Loss_{pair}$; when $w_{ij}=0$, it increases the Hamming distance between $b_i$ and $b_j$. With this loss function as the constraint, the hash codes generated for samples of the same class are close in Hamming distance and the hash codes generated for samples of different classes are far apart, so the hash codes obtained in this way are optimal.
$dist_H(b_i,b_j)$ in the formula is a discrete function, so its gradient cannot be taken and conventional stochastic gradient descent cannot be applied; that is, the model parameters cannot be adjusted through negative feedback. To solve the problem that the loss function is not differentiable, the feature vector of a sample image passes through a tanh function after the hash layer and before the hash code is generated. tanh compresses real values into (-1, 1), and because its gradient is largest around 0, its outputs tend to be distributed near -1 and 1, which favors the generation of hash codes. The final hash codes $b_i$ and $b_j$ are obtained from the relaxed variables, and the values before relaxation are $u_i,u_j\in R^m$. To make the function differentiable during training, the pre-relaxation variables $u_i$ and $u_j$ are used in place of the hash codes $b_i$ and $b_j$ when calculating the loss function; and to prevent overfitting during training and improve the generalization ability of the model, a regularization term is added to the loss function. The loss function used in the actual training process is:
Figure BDA0002064098060000062
in the formula, α is the weight of the regularization term. As α → 0 the model tends to overfit, and as α → ∞ the model underfits, so an appropriate value of α is also important for training the model.
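The role of the tanh relaxation can be illustrated with a short Python sketch. It does not reproduce the loss function of the invention, which is contained in the equation image; it only shows the relaxation itself and a standard identity for codes in $\{-1,1\}^m$, namely that the Hamming distance equals $(m - b_i\cdot b_j)/2$, whose right-hand side remains differentiable when the real-valued $u_i$, $u_j$ are substituted for $b_i$, $b_j$. All function names are illustrative assumptions.

```python
import math


def relax(z):
    """Apply tanh element-wise: real-valued hash-layer outputs -> values in (-1, 1)."""
    return [math.tanh(x) for x in z]


def binarize(u):
    """Quantize relaxed values to a hash code in {-1, 1}^m by taking the sign."""
    return [1 if x >= 0 else -1 for x in u]


def hamming(b_i, b_j):
    """Discrete Hamming distance: number of positions where the codes differ."""
    return sum(1 for a, b in zip(b_i, b_j) if a != b)


def inner_product_distance(u_i, u_j):
    """(m - <u_i, u_j>) / 2: equals the Hamming distance on {-1, 1}^m codes
    and is differentiable when evaluated on the relaxed values instead."""
    m = len(u_i)
    return (m - sum(a * b for a, b in zip(u_i, u_j))) / 2
```

On binary codes the two distance functions agree; on the relaxed values only `inner_product_distance` gives a usable gradient, which is why a surrogate of this general kind is needed for training.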
After a trained deep hash model is obtained, establishing a sample library by using the deep hash model, wherein the sample library consists of image samples and corresponding hash codes;
3. aiming at the query image, generating a hash code of the query image by using a trained deep hash model;
4. and searching by utilizing the hash code of the query image and the image sample library.
The second embodiment is as follows:
the process of searching by using the hash code of the query image and the image sample library in the embodiment comprises the following steps:
the hash function obtained from the deep hash model gives each sample image in the sample library its own hash code $\{h_1,h_2,\ldots,h_m\}$, $h_i\in\{0,1\}$. When images similar to a query sample q are to be retrieved, the Hamming distance to an image in the sample library is calculated as:
Figure BDA0002064098060000071
in the formula, $dist_H(h_i,h_j)$ is the Hamming distance and m is the length of the hash code. The formula shows that every bit of the hash code contributes equally. Yet each bit of a generated hash code encodes a single feature or a combination of several features, and this is ignored when retrieving by Hamming distance. Besides failing to reflect how expressive each feature is, the Hamming distance cannot further order retrieval results that share the same distance, which makes the retrieval results less accurate. The invention therefore assigns each bit of the hash code a specific feature weight $\omega_i$ and uses the weighted Hamming distance in the calculation, which refines the similarity between the query sample and the sample library data so that the returned results are more similar to the query sample. In this embodiment, each bit of the hash code may be given a specific weight. Assume that the weights of the bits of the hash code in a certain class are $\omega=\{\omega_1,\omega_2,\omega_3,\ldots,\omega_m\}$; the weighted Hamming distance is then defined as follows:
Figure BDA0002064098060000072
Compared with the discrete Hamming distance, the weighted Hamming distance measures similarity at a finer granularity, so results with the same Hamming distance can be further ordered. The invention provides a new weighting method; the design of the weight of each bit of the hash code is described in detail below.
Let the hash code corresponding to an image in the sample library be $p_i=\{h_{i,1},h_{i,2},h_{i,3},\ldots,h_{i,m}\}$ and the hash code corresponding to the query image be $p_{query}=\{h_{query,1},h_{query,2},h_{query,3},\ldots,h_{query,m}\}$; then in Hamming space, the neighbors of the query image within a distance threshold $\varepsilon$ are expressed as
$NN(p_{query},\varepsilon)=\{p_i \mid \|p_{query}-p_i\|_2<\varepsilon\}$
Because the Hamming distance is simple and efficient for image retrieval, the weights are designed to preserve that simplicity and efficiency. The feature weights proposed by the invention are based on neighbors in Hamming space: before the weighted Hamming distance is calculated, a neighbor set P is retrieved by Hamming distance. The Hamming distance between the hash code of every sample in P and the hash code of the query sample is smaller than the neighbor distance threshold, but the hash codes within the set differ from one another, as shown in fig. 2. To determine the weights of the different bits, all samples in the set P are tallied to obtain the probabilities of '0' and '1' at each bit, and these probabilities are then used to calculate the weighted Hamming distance between the query sample and the samples in the set.
For all the hash codes generated from a set of sample data, let $P("1")_i$ be the probability that the i-th bit of the hash code is '1' and $P("0")_i$ be the probability that the i-th bit is '0'; the following relationship holds:
$P("1")_i+P("0")_i=1$
From this relationship it follows that the more strongly features aggregate in the sample set P, the more the hash codes in the set agree on a definite value at a given bit. For example, if cats are identified mainly by their ears, then cat images in the sample library that show the ear feature will be highly consistent at certain code bits, and such bits are more important than others when the weights are designed. The weight of each bit of the hash code is therefore set according to the importance of that bit.
The weight ω is calculated as follows.
Figure BDA0002064098060000081
This calculation characterizes how expressive each bit is: the more one value dominates a given bit across all the hash codes, the more strongly that bit expresses a feature. For example, given 10 hash codes of length 12, if the first bit has nine '1's and one '0' while the second bit has five '1's and five '0's, then the first bit is weighted more heavily than the second.
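As a worked illustration of this example, assume the weight takes the form $\omega_i=1+\max(P(S_{i,0}),P(S_{i,1}))/m$, which is how the weight expression in the disclosure is read here (the equation image itself is not reproduced in the text, so the division by m is an assumption). With m = 12 and 10 hash codes:

```latex
\omega_i = 1 + \frac{\max\bigl(P(S_{i,0}),\,P(S_{i,1})\bigr)}{m}

\omega_1 = 1 + \frac{\max(0.1,\;0.9)}{12} = 1.075
\qquad
\omega_2 = 1 + \frac{\max(0.5,\;0.5)}{12} \approx 1.0417
```

Under this reading every weight lies in the interval $(1,\,1+1/m]$, so a hash code at Hamming distance $d\ge 1$ from the query has a weighted distance in $(d,\,d+1]$: the ordering by Hamming distance is preserved, and only codes at the same Hamming distance are reordered.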
The calculation of the weight ω shows that the weights mainly serve to distinguish samples with the same Hamming distance, while fundamentally preserving the relationship between Hamming distance and similarity. The relationship between the Hamming distance and the weighted Hamming distance is as follows:
Figure BDA0002064098060000091
in the formula, the symbol shown in Figure BDA0002064098060000092 is the weighted Hamming distance. The weighted Hamming distance refines the ordering rule without sacrificing the efficiency of the Hamming distance, and to a certain extent overcomes the problem of ordering results at the same distance.
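The rearrangement procedure of this embodiment can be sketched in Python as below. It is a sketch under stated assumptions rather than the implementation of the invention: hash codes are tuples of 0/1 bits, the neighbor test is "Hamming distance strictly less than `eps`", the weight is read as $1+\max(P_0,P_1)/m$, and the weighted Hamming distance is taken to be the sum of the weights of the differing bits (the defining equation image is not reproduced in the text). All function names are hypothetical.

```python
def hamming(a, b):
    """Number of bit positions at which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


def neighbor_set(query, library, eps=2):
    """Codes in the library whose Hamming distance to the query is below eps."""
    return [code for code in library if hamming(query, code) < eps]


def bit_weights(neighbors):
    """Per-bit weight 1 + max(P0, P1) / m, estimated on the neighbor set (assumed form)."""
    m = len(neighbors[0])
    n = len(neighbors)
    weights = []
    for k in range(m):
        p1 = sum(code[k] for code in neighbors) / n  # share of '1' at bit k
        p0 = 1.0 - p1                                # share of '0' at bit k
        weights.append(1.0 + max(p0, p1) / m)
    return weights


def weighted_hamming(a, b, weights):
    """Sum of the weights of the bits that differ (assumed form)."""
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


def rerank(query, library, eps=2):
    """Retrieve the neighbor set, then order it by weighted Hamming distance."""
    neighbors = neighbor_set(query, library, eps)
    if not neighbors:
        return []
    weights = bit_weights(neighbors)
    return sorted(neighbors, key=lambda code: weighted_hamming(query, code, weights))
```

In this sketch, two neighbors at the same Hamming distance from the query are separated by which bit they differ on: differing on a bit where the neighbor set is nearly unanimous costs more than differing on a bit where the set is split.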
Examples
Experiments were carried out on the CIFAR-10 (A. Krizhevsky, G. Hinton. Learning Multiple Layers of Features from Tiny Images [J]. 2012.) and NUS-WIDE (Zhang P, Zhang W, Li W J, et al. Supervised hashing with latent factor models [M]. 2014.) data sets so that the experimental comparison is valid and reliable. From each class of the CIFAR-10 data set, 600 image samples are drawn as experimental data, of which 500 are used as training data and the other 100 as test data. Because NUS-WIDE is a multi-label data set, two sample images are considered to be of the same class if they share at least one label. The mAP over the first 5000 returned samples is taken as the final comparison figure, computed in the same way as for the other methods. The results show that FastH, CNNH and NINH, which are combined with deep neural networks, are more accurate than the traditional methods. In CNNH, the hash codes that the deep neural network is fitted to are suboptimal compared with the hash codes obtained by other hash learning methods. In comparison, the deep hash method proposed here gives better experimental results, and its mAP rises as the length of the hash code increases. As shown in table 2, the deep hash model proposed by the invention improves on the other methods. Relative to the traditional hashing methods LSH, SH and ITQ the improvement is pronounced, and relative to other hash learning methods such as FastH, CNNH and NINH the method improves on both the CIFAR-10 and NUS-WIDE data sets, which verifies the performance of the deep hash model for hash coding.
TABLE 2 comparison of dataset retrieval accuracy (mAP) results
Figure BDA0002064098060000093
Figure BDA0002064098060000101
As can be seen from table 2, compared with the other methods, the deep hash network model proposed here improves noticeably on the CIFAR-10 data set, by 3.8%, 3.5%, 5.0% and 5.1% at the respective hash code lengths. On the NUS-WIDE data set the improvements at the respective hash code lengths are 5%, 6.8%, 5.4% and 6.8%. The comparison experiments show an improvement for hash codes of every tested length on both data sets.
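The mAP figure used in this comparison can be computed along the following lines. The source states only that the same calculation method as the other methods is used, so the convention below (average precision over the first k returned samples, normalized by the number of relevant samples found within those k) is an assumption, as are the function names; k would be 5000 in the experiment described above.

```python
def shares_label(labels_a, labels_b):
    """Multi-label relevance as described for NUS-WIDE: at least one common label."""
    return bool(set(labels_a) & set(labels_b))


def average_precision_at_k(relevant_flags, k):
    """Average precision over the first k returned samples.

    relevant_flags: sequence of truthy/falsy values in ranked order, one per
    returned sample, telling whether it is of the same class as the query.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevant_flags[:k], start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(all_flags, k):
    """Mean of the per-query average precision values."""
    return sum(average_precision_at_k(flags, k) for flags in all_flags) / len(all_flags)
```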
Feature extraction uses the ternary loss function to extract image features. The length of the extracted feature vector is also a key factor in hash code generation: too short a feature vector tends to overfit in the hash layer, while too long a feature vector may capture interfering features that affect hash code generation. To find the optimal feature layer length, the influence of different feature lengths on the final mAP was compared. The feature lengths tested are tied to the length of the final hash code: 'L', '2 x L', '3 x L', '4 x L' and '5 x L' were compared, where 'L' is the length of the finally generated hash code, and the broken lines in the comparison graphs represent the results for hash codes of different lengths.
Analysis of the line graphs for the two data sets (fig. 3 and fig. 4) shows that when the feature layer length equals the hash code length, the hash code is obtained directly from that layer, but the results are mediocre. As the feature layer length increases, the results improve and are best when the feature layer is 4 times the length of the hash code; increasing the length further slightly lowers the mAP for part of the data. Four times the hash code length is therefore the optimal feature layer length in the experiment.
In the visualization experiment, image retrieval is carried out mainly on the CIFAR-10 data set. It is a single-label data set in which each sample image contains little information, so particular features are represented more precisely and the returned results can be displayed more intuitively. The experiment returns the TOP-K samples with the smallest Hamming distance to the hash code of the query sample: the first image in each row is the query sample, followed by the 10 sample images closest to it in Hamming distance. The returned images show that the features extracted by the deep hash model distinguish the different categories. From the standpoint of objective classification, image retrieval based on the hash codes generated by the deep hash network model is accurate, but judged subjectively the similarity between the returned results and the query sample is only moderate: the retrieved images tend merely to be samples of the same class, as shown in fig. 5.
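The TOP-K retrieval by plain Hamming distance described here can be sketched as follows. This is an illustrative sketch only: hash codes are packed into Python integers so that the distance is an XOR followed by a bit count, and the library is assumed to be a list of `(image_id, code)` pairs; these representations and the function names are assumptions, not details from the source.

```python
import heapq


def hamming(code_a, code_b):
    """Hamming distance between two hash codes packed into integers."""
    return bin(code_a ^ code_b).count("1")


def top_k(query_code, library, k=10):
    """Return the k library entries whose codes are closest to the query.

    library: list of (image_id, code) pairs, with code an integer bit pattern.
    """
    return heapq.nsmallest(k, library, key=lambda item: hamming(query_code, item[1]))
```

Because the Hamming distance takes only integer values, many entries tie at the same distance and their relative order is arbitrary; this is the limitation that the weighted Hamming distance is intended to address.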
This experiment mainly compares and verifies the rearrangement of the results returned by deep hashing. Hash codes are generated for the CIFAR-10 data set by the triplet-based deep hash network model, and the retrieval results are then returned using the rearrangement algorithm based on feature weights. The experiment has one key parameter: the distance threshold in the Hamming space is set to 2, which means that the returned results whose Hamming distance is less than 2 are rearranged by the feature-weight-based rearrangement algorithm before being returned. As the figure shows, the rearranged results share more obvious features with the retrieval sample when analyzed visually, and the returned results are more reasonable. A notable property of the algorithm is that it still distinguishes returned results with different Hamming distances and hash codes, so the rearranged results contain the same returned items as the results that are not rearranged. Comparing the TOP-K results returned directly by Hamming distance with the results returned after rearrangement shows that the number of similar samples among the first 10 returned results increases and the accuracy improves. The comparison also shows better similarity in subjective visual terms, as shown in fig. 6.
Subjective judgment shows that the accuracy of the returned results differs for different values of K in TOP-K, and the pattern can be summarized by comparing the accuracy for different K values experimentally. The smaller the K value, the higher the accuracy after rearrangement; as K gradually increases, the difference between the accuracy after rearrangement and the accuracy before rearrangement gradually shrinks until the two are equal. The experiment also verifies that the quantization-based hash rearrangement algorithm can distinguish, by their similarity to the retrieval sample, returned results that have the same hash code. The change in accuracy is shown in fig. 7.
The comparison experiments on the deep hash network structure and the rearrangement algorithm verify that the deep hash network structure provided by the invention is superior, and that the rearrangement algorithm gives a better visual effect for image retrieval based on hash codes. In conventional research on hash functions, similarity is generally compared by means of the Hamming distance, and when the data scale is large, the Hamming distance does not discriminate sufficiently.
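The dependence of accuracy on K can be illustrated with a small precision-at-K helper. This is a sketch under stated assumptions: the label lists below are made-up toy data, not results from the experiment, and `precision_at_k` is a hypothetical helper name. Both lists contain the same items in a different order, which mimics a rearrangement that changes only the ordering.

```python
from typing import List


def precision_at_k(returned_labels: List[str], query_label: str, k: int) -> float:
    """Fraction of the first k returned samples that share the query label."""
    top = returned_labels[:k]
    if not top:
        return 0.0
    return sum(label == query_label for label in top) / len(top)


# Hypothetical returned labels for one "cat" query, before and after
# rearrangement (same items, different order).
before = ["cat", "dog", "cat", "cat", "ship", "cat"]
after = ["cat", "cat", "cat", "dog", "cat", "ship"]

for k in (2, 4, 6):
    print(k, precision_at_k(before, "cat", k), precision_at_k(after, "cat", k))
```

In this toy case the gap between the two orderings is largest at small K and vanishes once K covers the whole rearranged set, which is the qualitative behaviour the paragraph describes.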

Claims (5)

1. The image data fast retrieval method based on Hash learning is characterized by comprising the following steps:
step 1, establishing a deep hash model:
the deep hash model comprises five convolution-pooling layers, two full-connection layers, a characteristic layer, a hash layer and an output layer;
step 2, training a deep Hash model:
the training data is a series of labeled data sets $\{(p_1,w_1),(p_2,w_2),(p_3,w_3),\ldots,(p_n,w_n)\}$, where $p_i$ is a sample image and $w_i$ is the label corresponding to the image sample;
the input is a triplet $\{p_i,p_j,p_k\}$, where $p_i$ and $p_j$ are of the same class, $p_i$ and $p_k$ are of different classes, and the similarity distance between samples of the same class is less than the similarity distance between samples of different classes;
after a trained deep hash model is obtained, establishing a sample library by using the deep hash model, wherein the sample library consists of image samples and corresponding hash codes;
step 3, for the query image, generating a hash code of the query image by using the trained deep hash model;
step 4, retrieving by using the hash code of the query image and the image sample library, the specific process being as follows:
let the hash code corresponding to an image in the sample library be $p_i=\{h_{i,1},h_{i,2},h_{i,3},\ldots,h_{i,m}\}$ and the hash code corresponding to the query image be $p_{query}=\{h_{query,1},h_{query,2},h_{query,3},\ldots,h_{query,m}\}$; in the Hamming space, the neighbors of the query image are expressed as $NN(p_{query},\varepsilon)=\{p \mid \|p_{query}-p_i\|_2<\varepsilon\}$;
obtaining the neighbor set $p$ of the query sample through $\|p_{query}-p_i\|_2<\varepsilon$;
counting, for each bit of all hash codes in the neighbor set $p$ of the query sample, the value "0" or "1" that has the larger proportion, $S=\{S_1,S_2,S_3,\ldots,S_m\}$, $S_i\in\{0,1\}$;
counting the probability of "0" at each bit of all hash codes in the neighbor set $p$ of the query sample, $P(S_0)=\{P(S_{1,0}),P(S_{2,0}),P(S_{3,0}),\ldots,P(S_{m,0})\}$, where $P(S_{i,0})\in[0,1]$;
$P(S_{i,0})=\sum(S_i=0,\,NN(p_{query},\varepsilon))/count(NN(p_{query},\varepsilon))$, evaluated for each bit of the hash code, where the sum is the number of hash codes in the neighbor set whose value at that bit is "0";
counting the probability of "1" at each bit of all hash codes in the neighbor set $p$ of the query sample, $P(S_1)=\{P(S_{1,1}),P(S_{2,1}),P(S_{3,1}),\ldots,P(S_{m,1})\}$, where $P(S_{i,1})\in[0,1]$;
$P(S_{i,1})=\sum(S_i=1,\,NN(p_{query},\varepsilon))/count(NN(p_{query},\varepsilon))$, evaluated for each bit of the hash code, where the sum is the number of hash codes in the neighbor set whose value at that bit is "1";
determining the weight of each bit of the hash code, $\omega=\{\omega_1,\omega_2,\omega_3,\ldots,\omega_m\}$, through $\omega_i=1+\max(P(S_{i,0}),P(S_{i,1}))/m$;
then using the weight of each bit of the hash code to calculate the weighted Hamming distance between the hash code $p_i$ corresponding to an image in the sample library and the hash code $p_{query}$ corresponding to the query image:
Figure FDA0002734932560000021
the images returned from the sample library for the query image, and their order, are determined by the weighted Hamming distance.
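The retrieval procedure of step 4 can be sketched as follows. Several points are assumptions and should be read as such: the neighbor threshold is called `eps` here; the per-bit weight is read as 1 + max(P0, P1) / m, which is one reading of the extracted formula; and because the weighted Hamming distance itself appears only as an image in this text, the sketch assumes it is the sum of the weights of the bits that differ. All function names and the toy codes are hypothetical.

```python
from typing import List, Sequence


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Plain Hamming distance between two equal-length binary codes."""
    return sum(x != y for x, y in zip(a, b))


def bit_weights(neighbors: List[Sequence[int]]) -> List[float]:
    """Per-bit weight 1 + max(P0, P1) / m over the neighbor set (assumed reading)."""
    m = len(neighbors[0])
    n = len(neighbors)
    weights = []
    for i in range(m):
        p0 = sum(code[i] == 0 for code in neighbors) / n  # share of "0" at bit i
        p1 = 1.0 - p0                                     # share of "1" at bit i
        weights.append(1.0 + max(p0, p1) / m)
    return weights


def weighted_hamming(a: Sequence[int], b: Sequence[int],
                     weights: Sequence[float]) -> float:
    """Assumed form: sum of the weights of the bits on which the codes differ."""
    return sum(w for x, y, w in zip(a, b, weights) if x != y)


def rerank(query: Sequence[int], library: List[Sequence[int]],
           eps: int = 2) -> List[int]:
    """Indices of library codes within Hamming distance < eps, reordered by weight."""
    neighbors = [i for i, code in enumerate(library) if hamming(query, code) < eps]
    if not neighbors:
        return []
    weights = bit_weights([library[i] for i in neighbors])
    return sorted(neighbors,
                  key=lambda i: weighted_hamming(query, library[i], weights))


# Toy 4-bit codes (hypothetical data).
query = [0, 0, 0, 0]
library = [[0, 0, 0, 1], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1], [1, 1, 1, 1]]
order = rerank(query, library, eps=2)
print(order)
```

Under these assumptions a mismatch on a bit where the neighbors largely agree costs more than a mismatch on a bit where they are split, so codes at the same plain Hamming distance can receive different ranks.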
2. The hash learning-based image data fast retrieval method according to claim 1, wherein the Hamming distance threshold ε used to determine the neighbors is 2.
3. The hash learning-based image data fast retrieval method according to claim 2, wherein the feature layer length is 4 times the hash code length.
4. The hash learning-based image data fast retrieval method according to claim 1, 2 or 3, wherein the hash layer uses a pairwise constraint as the constraint condition, and this stage takes as input the feature vectors of the feature layer, namely the pairwise constraint $\{p_i,p_j,w_{ij}\}$; $w_{ij}=1$ indicates that the samples represented by the two feature vectors are of the same class, and $w_{ij}=0$ indicates that the samples represented by the two feature vectors are not of the same class; the feature vectors generated by the feature layer are $F_i,F_j\in R^d$, the outputs mapped to the hash space are $b_i,b_j\in\{-1,1\}^m$, and $dist_H(b_i,b_j)$ is the Hamming distance between $b_i$ and $b_j$; the loss function is as follows:
Figure FDA0002734932560000022
after the feature vector of the sample image passes through the hash layer, it passes through a tanh function before the hash code is generated; the final hash codes obtained after relaxation are $b_i$ and $b_j$, and the values before relaxation are $u_i$ and $u_j$, $u_i,u_j\in R^m$; in calculating the loss function, the pre-relaxation variables $u_i$ and $u_j$ are used in place of the hash codes $b_i$ and $b_j$, and the loss function is:
Figure FDA0002734932560000023
in the formula, m is the length of the hash code, and α is the preference norm weight.
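The relationship between the real-valued outputs and the final codes can be illustrated with a short sketch. It does not reproduce the loss function, which appears only as an image in this text. It shows two things: a tanh squashing of hash layer outputs into (-1, 1), and the standard identity that for codes in {-1, 1}^m the Hamming distance equals (m - <b_i, b_j>) / 2. The function names, the sample values, and the choice of thresholding at zero are assumptions made for the example.

```python
import math
from typing import List


def relax(z: List[float]) -> List[float]:
    """Squash raw hash-layer outputs into (-1, 1) with tanh."""
    return [math.tanh(v) for v in z]


def binarize(u: List[float]) -> List[int]:
    """Map real values to {-1, 1}; thresholding at zero is an assumption."""
    return [1 if v >= 0 else -1 for v in u]


def hamming_pm1(b_i: List[int], b_j: List[int]) -> int:
    """Hamming distance of {-1, 1} codes via the inner-product identity."""
    m = len(b_i)
    inner = sum(x * y for x, y in zip(b_i, b_j))
    return (m - inner) // 2


# Hypothetical raw outputs for two samples, m = 4.
z_i = [2.0, -1.5, 0.3, -0.2]
z_j = [1.0, 0.8, -0.4, -0.9]
u_i, u_j = relax(z_i), relax(z_j)
b_i, b_j = binarize(u_i), binarize(u_j)
print(b_i, b_j, hamming_pm1(b_i, b_j))
```

Because the real-valued vectors are differentiable while the {-1, 1} codes are not, using the real-valued vectors inside a loss keeps gradient-based training possible, which is a common motivation for this kind of substitution.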
5. The hash learning-based image data fast retrieval method according to claim 4, wherein the training of the deep hash model is performed online using small-batch data, and the small-batch triplets are created according to the following rules: (1) determining the number of samples of each different label selected from the small batch, and taking the smallest number of label samples; (2) randomly shuffling the samples of a given label, and selecting samples i and i+1 as the anchor $p_i$ and the positive example $p_j$ of a triplet; (3) randomly selecting a sample i of another label as the negative example $p_k$ for the anchor $p_i$ of the triplet; (4) cycling through all labels and all samples to generate random combinations containing anchors, positive examples and negative examples.
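The four triplet construction rules can be sketched as below. This is one possible reading, not the patented procedure itself: the smallest per-label count is used here to cap how many samples of each label are paired, the batch is represented as hypothetical (sample id, label) tuples, and `build_triplets` is an illustrative name.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple


def build_triplets(batch: List[Tuple[str, str]],
                   rng: random.Random) -> List[Tuple[str, str, str]]:
    """Build (anchor, positive, negative) triplets from (sample, label) pairs."""
    by_label: Dict[str, List[str]] = defaultdict(list)
    for sample, label in batch:
        by_label[label].append(sample)
    if len(by_label) < 2:
        return []  # no negatives are available with a single label

    # Rule 1: the smallest number of samples found for any label.
    n_min = min(len(samples) for samples in by_label.values())

    triplets = []
    for label, samples in by_label.items():  # Rule 4: loop over all labels
        shuffled = samples[:]
        rng.shuffle(shuffled)                # Rule 2: shuffle within the label
        shuffled = shuffled[:n_min]
        others = [s for other, group in by_label.items()
                  if other != label for s in group]
        for i in range(n_min - 1):
            anchor, positive = shuffled[i], shuffled[i + 1]
            negative = rng.choice(others)    # Rule 3: sample from another label
            triplets.append((anchor, positive, negative))
    return triplets


# Hypothetical small batch with three labels.
batch = [("a1", "cat"), ("a2", "cat"), ("a3", "cat"),
         ("b1", "dog"), ("b2", "dog"),
         ("c1", "ship"), ("c2", "ship")]
triplets = build_triplets(batch, random.Random(0))
print(triplets)
```

Capping each label at the smallest count keeps the number of triplets per label balanced, which is the interpretation of rule (1) assumed in this sketch.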
CN201910415146.5A 2019-05-17 2019-05-17 Image data quick retrieval method based on Hash learning Expired - Fee Related CN110134803B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910415146.5A CN110134803B (en) 2019-05-17 2019-05-17 Image data quick retrieval method based on Hash learning

Publications (2)

Publication Number Publication Date
CN110134803A CN110134803A (en) 2019-08-16
CN110134803B true CN110134803B (en) 2020-12-11

Family

ID=67571194

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910415146.5A Expired - Fee Related CN110134803B (en) 2019-05-17 2019-05-17 Image data quick retrieval method based on Hash learning

Country Status (1)

Country Link
CN (1) CN110134803B (en)

Also Published As

Publication number Publication date
CN110134803A (en) 2019-08-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20201211

Termination date: 20210517
