CN110134803B

CN110134803B - Image data quick retrieval method based on Hash learning

Info

Publication number: CN110134803B
Application number: CN201910415146.5A
Authority: CN
Inventors: 王红滨; 纪斯佳; 张毅; 周连科; 王念滨; 童鹏鹏; 崔琎
Original assignee: Harbin Engineering University
Current assignee: Harbin Engineering University
Priority date: 2019-05-17
Filing date: 2019-05-17
Publication date: 2020-12-11
Anticipated expiration: 2039-05-17
Also published as: CN110134803A

Abstract

A fast image data retrieval method based on hash learning, belonging to the technical field of data retrieval. The method addresses the problem that existing models apply relaxation several times in the hash code generation stage, which biases the back-propagation (negative feedback) process during training. The deep hash model comprises five convolution-pooling layers, two fully connected layers, a feature layer, a hash layer and an output layer. The model is trained under triplet constraints; the trained deep hash model is then used to build a sample library consisting of image samples and their corresponding hash codes. For a query image, the trained deep hash model generates the hash code of the query image, and retrieval is performed using that hash code and the image sample library. The invention is suitable for image data retrieval.

Description

Image data quick retrieval method based on Hash learning

Technical Field

The invention relates to a method for quickly retrieving image data, and belongs to the technical field of data retrieval.

Background

In recent years, with the rapid development of the internet, the volume of high-dimensional data has grown exponentially, and how to make use of this data has become a focus across industries. Researchers have proposed many methods for large-scale data retrieval, and hashing has been widely adopted for its storage and computational efficiency (Li Wu-Jun, Zhou Zhi-Hua. Learning to hash for big data: current status and future trends [J]. Chinese Science Bulletin, 2015, 60(Z1): 485-490). Traditional hashing methods include locality-sensitive hashing and spectral hashing; they have achieved some success in image retrieval but remain some distance from practical application. The rapid development of deep learning has advanced hashing methods. In 2014, Pan Yan, Yan Shuicheng et al. combined hashing with a convolutional neural network for the first time and proposed the Convolutional Neural Network Hashing model (CNNH) (R. Xia, Y. Pan, H. Lai, et al. Supervised Hashing for Image Retrieval via Image Representation Learning [C]. AAAI Conference on Artificial Intelligence, 2014), which performs better than traditional hashing methods. CNNH learns hash codes in two stages. The first stage decomposes a similarity matrix S, each element of which indicates whether the sample images of its row and column are similar, to obtain a matrix H, each row of which is the approximate hash code of one training sample. However, the image representations learned during training cannot feed back into hash code generation, the Hamming distances between hash codes cannot be adjusted dynamically, and the learned hash function is suboptimal because the advantages of the convolutional neural network are not fully exploited.

On this basis, Li W J, Wang S, Kang W C et al. proposed the Deep Neural Network Hashing model (DNNH), and H. Liu, R. Wang, S. Shan et al. proposed the Deep Supervised Hashing model (DSH) in "Deep Learning based Deep Supervised Hash with Pairwise Labels". Both are end-to-end models that essentially overcome CNNH's separation of feature extraction from hash coding, and each designs a loss function for hash code generation from a different angle. However, both models apply relaxation several times in the hash code generation stage, which biases back-propagation during training, so the trained models are not accurate enough for image data retrieval.

Disclosure of Invention

The invention provides a fast image data retrieval method based on hash learning. It aims to solve the problem that existing models apply relaxation several times in the hash code generation stage, which biases the back-propagation (negative feedback) process of the model during training.

The invention discloses a quick retrieval method of image data based on Hash learning, which comprises the following steps:

step 1, establishing a deep hash model:

the deep hash model comprises five convolution-pooling layers, two fully connected layers, a feature layer, a hash layer and an output layer;

step 2, training the deep hash model:

the training data is a labeled data set $\{(p_1, w_1), (p_2, w_2), (p_3, w_3), \ldots, (p_n, w_n)\}$, where $p_i$ is a sample image and $w_i$ is the label corresponding to that image sample;

the input is a triplet $\{p_i, p_j, p_k\}$, where $p_i$ and $p_j$ belong to the same class and $p_i$ and $p_k$ belong to different classes, and the similarity distance between samples of the same class is less than that between samples of different classes;

after a trained deep hash model is obtained, establishing a sample library by using the deep hash model, wherein the sample library consists of image samples and their corresponding hash codes;

step 3, for a query image, generating the hash code of the query image by using the trained deep hash model;

step 4, retrieving by using the hash code of the query image and the image sample library.

Further, the process of retrieving by using the hash code of the query image and the image sample library includes the following steps:

let the hash code corresponding to an image in the sample library be $p_i = \{h_{i,1}, h_{i,2}, h_{i,3}, \ldots, h_{i,m}\}$ and the hash code corresponding to the query image be $p_{query} = \{h_{query,1}, h_{query,2}, h_{query,3}, \ldots, h_{query,m}\}$; in Hamming space, the neighbors of the query image are represented as $NN(p_{query}, \varepsilon) = \{p_i \mid \|p_{query} - p_i\|_2 < \varepsilon\}$;

obtain the neighbor set p of the query sample through $\|p_{query} - p_i\|_2 < \varepsilon$;

for each bit position over all hash codes in the neighbor set p, determine which of "0" or "1" occurs in the larger proportion, giving $S = \{S_1, S_2, S_3, \ldots, S_m\}$, $S_i \in \{0, 1\}$;

count the probability of "0" in each bit position over all hash codes in the neighbor set p, $P(S_0) = \{P(S_1, 0), P(S_2, 0), P(S_3, 0), \ldots, P(S_m, 0)\}$, where $P(S_i, 0) \in [0, 1]$;

$P(S_i, 0) = \sum(S_i = 0, NN(p_{query}, \varepsilon)) / count(NN(p_{query}, \varepsilon))$, computed for each bit of the hash code, where the sum counts the hash codes in the neighbor set whose i-th bit is "0" and count is the number of hash codes in the neighbor set;

count the probability of "1" in each bit position over all hash codes in the neighbor set p, $P(S_1) = \{P(S_1, 1), P(S_2, 1), P(S_3, 1), \ldots, P(S_m, 1)\}$, where $P(S_i, 1) \in [0, 1]$;

$P(S_i, 1) = \sum(S_i = 1, NN(p_{query}, \varepsilon)) / count(NN(p_{query}, \varepsilon))$, computed for each bit of the hash code, where the sum counts the hash codes in the neighbor set whose i-th bit is "1" and count is the number of hash codes in the neighbor set;

determine the weight of each bit of the hash code, $\omega = \{\omega_1, \omega_2, \omega_3, \ldots, \omega_m\}$, through $\omega_i = 1 + \max(P(S_i, 0), P(S_i, 1)) / m$;

then use the weight of each bit of the hash code to calculate the weighted Hamming distance between the hash code $p_i$ of an image in the sample library and the hash code $p_{query}$ of the query image;

the images returned from the database for the query image, and their order, are determined by the weighted Hamming distance.
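The retrieval steps above can be sketched end to end as follows. This is a minimal illustration rather than the patented implementation: the function names (`hamming`, `bit_weights`, `rerank`) are hypothetical, the neighbor test is taken to be "Hamming distance smaller than eps", and the weight is assumed to be 1 + max(P0, P1)/m, which is a reading of a formula that is garbled in the source text.

```python
from typing import List, Sequence, Tuple

Code = Sequence[int]  # one hash code as a sequence of 0/1 bits


def hamming(a: Code, b: Code) -> int:
    """Number of bit positions in which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


def bit_weights(neighbors: List[Code], m: int) -> List[float]:
    """Assumed weight: omega_i = 1 + max(P(S_i,0), P(S_i,1)) / m."""
    n = len(neighbors)
    weights = []
    for i in range(m):
        p1 = sum(code[i] for code in neighbors) / n  # share of '1' in bit i
        weights.append(1.0 + max(p1, 1.0 - p1) / m)
    return weights


def rerank(query: Code, library: List[Code], eps: int = 2) -> List[Tuple[int, float]]:
    """Return (library index, weighted distance) for neighbors, nearest first."""
    m = len(query)
    # Neighbor set: library codes whose Hamming distance to the query is < eps.
    idx = [k for k, code in enumerate(library) if hamming(query, code) < eps]
    if not idx:
        return []
    w = bit_weights([library[k] for k in idx], m)
    # Weighted Hamming distance: sum the weights of the mismatching bits.
    scored = [
        (k, sum(w[i] for i in range(m) if library[k][i] != query[i]))
        for k in idx
    ]
    return sorted(scored, key=lambda t: t[1])


query = [1, 0, 1, 1]
library = [
    [1, 0, 1, 1],  # identical to the query
    [1, 0, 1, 0],  # differs in the last bit
    [0, 0, 1, 1],  # differs in the first bit
    [1, 0, 1, 0],  # differs in the last bit
    [0, 1, 0, 0],  # differs everywhere, so it is not a neighbor
]
ranking = rerank(query, library)
```

In this toy library the three distance-1 neighbors are no longer tied: a mismatch on a bit where the neighbors mostly agree costs more than a mismatch on a bit where they are evenly split.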

Further, the threshold $\varepsilon$ in the neighbor definition is 2.

Further, the length of the feature layer is 4 times the length of the hash code.

Further, the hash layer uses a pairwise constraint as its constraint condition. The input at this stage is the feature vectors of the feature layer, i.e. the pairwise-constraint input $\{p_i, p_j, w_{ij}\}$; $w_{ij} = 1$ indicates that the samples represented by the two feature vectors belong to the same class, and $w_{ij} = 0$ indicates that they do not. The feature vectors generated by the feature layer are $F_i, F_j \in R^d$, the outputs mapped into Hamming space are $b_i, b_j \in \{-1, 1\}^m$, and $dist_H(b_i, b_j)$ is the Hamming distance between $b_i$ and $b_j$; the loss function is as follows:

after passing through the hash layer and before the hash code is generated, the feature vector of a sample image passes through a tanh function. The final hash codes $b_i$ and $b_j$ are the values obtained after relaxation, and the values before relaxation are $u_i$ and $u_j$, with $u_i, u_j \in R^m$; in the calculation of the loss function, the pre-relaxation variables $u_i$ and $u_j$ are used in place of the hash codes $b_i$ and $b_j$, and the loss function is:

in the formula, m is the length of the hash code, and α is the weight of the regularization term.

Further, in the process of training the deep hash model, mini-batches of data are used for online training, and the mini-batch triplets are created according to the following rules: (1) determine, within the mini-batch, the number of selected samples for each label, and take the smallest per-label sample count; (2) randomly shuffle the samples of a given label, and select the i-th and (i+1)-th samples as the anchor $p_i$ and positive example $p_j$ of a triplet; (3) randomly select a sample with a different label as the negative example $p_k$ for the anchor $p_i$; (4) loop over all labels and all samples, generating random combinations that each contain an anchor, a positive example and a negative example.

The most prominent characteristics and remarkable beneficial effects of the invention are as follows:

Comparison experiments on the deep hash network structure and on the rearrangement algorithm verify that the deep hash network structure provided by the invention performs better than the compared methods, and that the rearrangement algorithm gives visually better results for hash-code-based image retrieval. Conventional research on hash functions generally compares similarity by Hamming distance, which does not discriminate sufficiently when the data scale is large. Experiments show that the hash-learning-based method performs better on CIFAR-10 and NUS-WIDE than the other methods.

Drawings

FIG. 1 is a deep hash network architecture;

FIG. 2 is a diagram illustrating different classifications of hash codes in a neighborhood set;

FIG. 3 is a comparison graph of CIFAR-10 results;

FIG. 4 is a NUS-WIDE comparison graph;

FIG. 5 is a visual experimental result;

FIG. 6 is a visualization experiment result of the rearrangement algorithm;

FIG. 7 shows the accuracy for 24-bit hash codes.

Detailed Description

The first embodiment is as follows:

the image data fast retrieval method based on the hash learning of the embodiment specifically comprises the following steps:

1. establishing a deep hash model:

the deep hash model comprises five convolution-pooling layers, two fully connected layers, a feature layer, a hash layer and an output layer. The feature layer outputs a feature vector of a given length, and the hash layer then maps this feature vector to a hash code. The model structure is shown in FIG. 1, and the specific parameters are given in Table 1.

TABLE 1 model parameters

Each fully connected layer consists of a single layer of 500 × 1 neurons followed by an activation function. The fully connected layers connect the features of the intermediate layers, and the hash layer maps the relationships among the extracted features to different bits of the hash code, so that different sample images produce hash codes separated by a large Hamming distance.

2. Training a deep hash model:

the training data is a labeled data set $\{(p_1, w_1), (p_2, w_2), (p_3, w_3), \ldots, (p_n, w_n)\}$, where $p_i$ is a sample image and $w_i$ is the label corresponding to that image sample;

the triplet $\{p_i, p_j, p_k\}$ expresses a ternary constraint on the proximity between samples: under a given metric, the distance between $p_i$ and $p_j$ is less than the distance between $p_i$ and $p_k$. In the actual training of the invention, the ternary constraint classifies better and adapts better to the model. The input is a triplet $\{p_i, p_j, p_k\}$, where $p_i$ and $p_j$ belong to the same class and $p_i$ and $p_k$ belong to different classes, and the similarity distance between samples of the same class is less than that between samples of different classes;

in general, triplets could be formed from all combinations of the training data, but training would then be very inefficient, and erroneous samples among the training data could mislead the model. To ensure that the model converges quickly to the ternary constraint, the invention trains online on many mini-batches; for example, 40 sample images are selected each time, and triplets are built from those samples. The creation of mini-batch triplets follows several rules: (1) determine, within the mini-batch, the number of selected samples for each label, and take the smallest per-label sample count; (2) randomly shuffle the samples of a given label, and select the i-th and (i+1)-th samples as the anchor $p_i$ and positive example $p_j$ of a triplet; (3) randomly select a sample with a different label as the negative example $p_k$ for the anchor $p_i$; (4) loop over all labels and all samples, generating random combinations that each contain an anchor, a positive example and a negative example. These rules keep the sample data evenly distributed while adding randomness.
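The four rules can be sketched as below. This is one possible reading, not the patent's code: `make_triplets` is a hypothetical name, rule (1) is interpreted as truncating every label to the size of the rarest label in the mini-batch, and indices into the mini-batch stand in for images.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def make_triplets(
    labels: Sequence[Hashable], rng: random.Random
) -> List[Tuple[int, int, int]]:
    """Build (anchor, positive, negative) index triplets from one mini-batch."""
    by_label: Dict[Hashable, List[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    if len(by_label) < 2:
        return []  # a negative example needs at least two distinct labels

    # Rule 1 (assumed reading): use the size of the rarest label for every label.
    per_label = min(len(members) for members in by_label.values())

    triplets = []
    for label, members in by_label.items():  # Rule 4: visit every label
        shuffled = members[:]
        rng.shuffle(shuffled)  # Rule 2: shuffle the samples of this label
        shuffled = shuffled[:per_label]
        others = [i for l, v in by_label.items() if l != label for i in v]
        for i in range(len(shuffled) - 1):  # Rule 2: samples i and i+1
            anchor, positive = shuffled[i], shuffled[i + 1]
            negative = rng.choice(others)  # Rule 3: random other-label sample
            triplets.append((anchor, positive, negative))
    return triplets


batch_labels = ["cat"] * 3 + ["dog"] * 4 + ["car"] * 5
triplets = make_triplets(batch_labels, random.Random(0))
```

With three labels and a rarest-label size of 3, each label contributes two triplets, so the batch above yields six triplets in total.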

Feature layer: the convergence condition of the deep hash network is that the feature vectors output by the feature layer for the training data satisfy the triplet constraint, which leads the model to extract more expressive features. Applied to feature extraction, the triplet constraint requires the Euclidean distance between feature vectors of same-class samples to be smaller than that between samples of different classes. The formula is as follows:

In the formula, the three inputs are the anchor sample, the positive sample and the negative sample; f is the learned mapping function (from sample images to feature vectors); threshold is a margin that controls the gap between the positive-pair and negative-pair distances; and ||·|| is the Euclidean distance between feature vectors. When the intra-class distance is smaller than the inter-class distance, the error is 0; otherwise there is an error. The formula expresses this with the "+" subscript.
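The hinge behaviour described above can be sketched as follows. The exact formula is not reproduced in the source text, so this uses the common form max(0, d(anchor, positive) - d(anchor, negative) + threshold) with plain (non-squared) Euclidean distance; both choices are assumptions, and `triplet_loss` is a hypothetical name.

```python
import math
from typing import Sequence


def euclidean(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    threshold: float,
) -> float:
    """Zero when the negative is farther than the positive by at least threshold."""
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    # The "+" subscript in the text corresponds to clamping at zero.
    return max(0.0, d_pos - d_neg + threshold)


satisfied = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0], threshold=1.0)
violated = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0], threshold=1.0)
```

Here `satisfied` is zero because the negative is already much farther than the positive, while `violated` is positive because the roles are swapped.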

When the threshold is small in the training phase, the loss function $Loss^{triplet}$ approaches 0 more easily: the anchor and the positive example need not be pulled very close, and the anchor and the negative example need not be pushed very far apart, but the resulting model is then more difficult to converge. When the threshold is large, the model must pull the anchor and the positive example close together and push the anchor and the negative example far apart, so $Loss^{triplet}$ stays at a large value. A reasonable threshold is therefore particularly critical for training the model. The deep hash network constrains the feature layer with the triplet loss function: by minimizing $Loss^{triplet}$ and back-propagating through the network, the network parameters are adjusted to obtain more expressive features.

Hash layer: the hash layer uses a pairwise constraint as its constraint condition. The input at this stage is the feature vectors of the feature layer, i.e. the pairwise-constraint input $\{p_i, p_j, w_{ij}\}$; $w_{ij} = 1$ indicates that the samples represented by the two feature vectors belong to the same class, and $w_{ij} = 0$ indicates that they do not. The feature vectors generated by the feature layer are $F_i, F_j \in R^d$, the outputs mapped into Hamming space are $b_i, b_j \in \{-1, 1\}^m$, and $dist_H(b_i, b_j)$ is the Hamming distance between $b_i$ and $b_j$; the loss function is as follows:

wherein m is the length of the hash code;

Dividing the loss function by m keeps it between 0 and 1 regardless of the hash code length. Without the division, the loss would grow with the hash code length; normalizing by m makes the result more accurate.

When $w_{ij} = 1$, gradient descent on $Loss^{pair}$ reduces the Hamming distance between $b_i$ and $b_j$ as far as possible in order to reduce $Loss^{pair}$; when $w_{ij} = 0$, it increases the Hamming distance between $b_i$ and $b_j$. With this loss function as the constraint, hash codes generated for samples of the same class are separated by a small Hamming distance, hash codes generated for samples of different classes are separated by a large Hamming distance, and the hash codes obtained in this way are optimal.

The function $dist_H(b_i, b_j)$ in the formula is discrete and therefore not differentiable, so stochastic gradient descent cannot be applied in the usual way; that is, the model parameters cannot be adjusted by back-propagation. To solve this problem, the feature vector of a sample image passes through a tanh function after the hash layer and before the hash code is generated. The advantage of tanh is that it compresses real values into (-1, 1), and because its gradient is largest around 0, the outputs are driven toward -1 and 1 as far as possible, which facilitates hash code generation. The final hash codes $b_i$ and $b_j$ are the values obtained after relaxation, and the values before relaxation are $u_i$ and $u_j$, with $u_i, u_j \in R^m$. To keep the function differentiable during training, the pre-relaxation variables $u_i$ and $u_j$ are used in place of the hash codes $b_i$ and $b_j$ in the calculation of the loss function. To prevent overfitting during training and improve the generalization ability of the model, a regularization term is added to the loss function. The loss function used in the actual training process is:

in the formula, α is the weight of the regularization term. When α → 0 the model tends to overfit, and when α → ∞ the model underfits, so an appropriate value of α is also important for training the model.
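The relaxation idea can be illustrated with a small sketch. The patent's actual loss formulas are not reproduced in the source text, so everything below is an assumption made for illustration: the data term is a simple normalised-distance form, the regular term is a penalty pulling each relaxed entry toward ±1, and the names `relax`, `binarize`, `soft_hamming` and `pair_loss` are hypothetical. The one identity used that does not depend on those choices is that, for codes in {-1, 1}^m, the Hamming distance equals (m - <b_i, b_j>) / 2.

```python
import math
from typing import List, Sequence


def relax(z: Sequence[float]) -> List[float]:
    """tanh squashes hash-layer outputs into (-1, 1) and stays differentiable."""
    return [math.tanh(v) for v in z]


def binarize(u: Sequence[float]) -> List[int]:
    """Final code in {-1, 1}^m, used at retrieval time rather than for gradients."""
    return [1 if v >= 0 else -1 for v in u]


def soft_hamming(u_i: Sequence[float], u_j: Sequence[float]) -> float:
    """(m - <u_i, u_j>) / 2; equals the Hamming distance when all entries are +-1."""
    m = len(u_i)
    return (m - sum(a * b for a, b in zip(u_i, u_j))) / 2.0


def pair_loss(
    u_i: Sequence[float], u_j: Sequence[float], w_ij: int, alpha: float
) -> float:
    """ASSUMED form: normalised distance term plus alpha times a regular term."""
    m = len(u_i)
    d = soft_hamming(u_i, u_j) / m  # dividing by m keeps the term within [0, 1]
    data_term = w_ij * d + (1 - w_ij) * (1.0 - d)
    # ASSUMPTION: the regular term penalises entries that are far from +-1.
    reg = sum(abs(abs(v) - 1.0) for v in list(u_i) + list(u_j)) / m
    return data_term + alpha * reg


code = binarize(relax([2.0, -0.3, 0.0]))
b1, b2 = [1, -1, 1, 1], [1, -1, 1, -1]  # differ in exactly one position
same_class = pair_loss(b1, b2, w_ij=1, alpha=0.1)
diff_class = pair_loss(b1, b2, w_ij=0, alpha=0.1)
```

For the pair above, which differs in one of four bits, the assumed data term is small when the pair is labelled similar and large when it is labelled dissimilar, matching the behaviour the text describes for the two values of w_ij.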

3. aiming at the query image, generating a hash code of the query image by using a trained deep hash model;

4. and searching by utilizing the hash code of the query image and the image sample library.

The second embodiment is as follows:

the process of searching by using the hash code of the query image and the image sample library in the embodiment comprises the following steps:

the hash function obtained from the deep hash model gives each sample image in the sample library its own hash code $\{h_1, h_2, \ldots, h_m\}$, $h_i \in \{0, 1\}$. When images similar to the query sample q are to be retrieved, the Hamming distance to the images in the sample library is calculated as:

in the formula, $dist_H(h_i, h_j)$ is the Hamming distance and m is the length of the hash code. The formula shows that every bit of the hash code has the same effect. Yet in the process of generating the hash code, each bit represents a single feature or a combination of several features, and this is ignored when retrieving by Hamming distance. Besides failing to express these features, Hamming-distance retrieval cannot further separate results that lie at the same Hamming distance, which makes the retrieval results less accurate. The invention therefore assigns each bit of the hash code a specific feature weight $\omega_i$ and uses the weighted Hamming distance, so that the similarity between the query sample and the sample library data is refined and the returned results are more similar to the query sample. In this embodiment, each bit of the hash code may be given a specific weight. Assuming that the weights of the bits of the hash code in a certain class are $\omega = \{\omega_1, \omega_2, \omega_3, \ldots, \omega_m\}$, the weighted Hamming distance is defined as follows:

Compared with the discrete Hamming distance, the weighted Hamming distance measures similarity at a finer granularity, so samples at the same Hamming distance can be further distinguished. The invention provides a new weighting method; the design of the weight of each bit of the hash code is described in detail below.
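A short sketch of the two distances, using their standard definitions (a count of differing bits, and a sum of the weights of the differing bits). The weight values in the example are made up purely to show the tie-breaking effect; they are not taken from the patent.

```python
from typing import Sequence


def hamming_distance(h_i: Sequence[int], h_j: Sequence[int]) -> int:
    """Count of positions where the two 0/1 codes differ (XOR summed over bits)."""
    return sum(a ^ b for a, b in zip(h_i, h_j))


def weighted_hamming_distance(
    h_i: Sequence[int], h_j: Sequence[int], omega: Sequence[float]
) -> float:
    """Sum of the weights of the positions where the two codes differ."""
    return sum(w * (a ^ b) for a, b, w in zip(h_i, h_j, omega))


query = [1, 0, 1, 0]
first = [0, 0, 1, 0]   # differs from the query in bit 1
second = [1, 0, 1, 1]  # differs from the query in bit 4
omega = [1.2, 1.0, 1.0, 1.1]  # hypothetical weights, for illustration only

tie = (hamming_distance(query, first), hamming_distance(query, second))
d_first = weighted_hamming_distance(query, first, omega)
d_second = weighted_hamming_distance(query, second, omega)
```

Both candidates are at Hamming distance 1 from the query, but the weighted distance ranks `second` ahead of `first` because the bit it mismatches carries the smaller weight.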

Let the hash code corresponding to an image in the sample library be $p_i = \{h_{i,1}, h_{i,2}, h_{i,3}, \ldots, h_{i,m}\}$ and the hash code corresponding to the query image be $p_{query} = \{h_{query,1}, h_{query,2}, h_{query,3}, \ldots, h_{query,m}\}$; then in Hamming space, the neighbors of the query image are represented as

$$NN(p_{query}, \varepsilon) = \{p_i \mid \|p_{query} - p_i\|_2 < \varepsilon\}$$

Because the Hamming distance is simple and efficient for image retrieval, the weight design preserves that simplicity and efficiency. The feature weights proposed by the invention are based on neighbors in Hamming space: before the weighted Hamming distance is computed, a neighbor sample set p is retrieved by Hamming distance. The Hamming distance between the hash code of every sample in p and the hash code of the query sample is smaller than the neighbor threshold, but the hash codes in the set differ from one another, as shown in FIG. 2. To determine the weight of each bit, all samples in p are examined, the probabilities of "0" and "1" in each bit position are counted, and these probabilities are then used to compute the weighted Hamming distance between the query sample and the samples in the set.

For all the hash codes generated from the sample data in the set, let $P("1")_i$ be the probability that the i-th bit of the hash code is "1" and $P("0")_i$ the probability that the i-th bit is "0"; the following relationship then holds:

$$P("1")_i + P("0")_i = 1$$

From this relationship, the more pronounced the feature aggregation in the sample set p, the more the hash codes of most samples agree on a definite value in certain bits. For example, if cats are identified mainly by their ears, then cat images in the sample library that show the ear feature are highly consistent in the corresponding code bits. When the weights are designed, such bits are more important than the others, so the weight of each bit of the hash code is set according to its importance.

The weight ω is calculated as follows.

(2) The weight characterizes how expressive each bit is. The more one-sided the values of a bit are across all the hash codes, the stronger that bit's feature expression. For example, given 10 hash codes of length 12, if the first bit contains nine "1"s and one "0" while the second bit contains five "1"s and five "0"s, then the first bit is weighted more heavily than the second.
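As a worked instance of this example, assume the weight takes the form $\omega_i = 1 + \max(P(S_i, 0), P(S_i, 1)) / m$ (the division by m is a reading of a formula that is garbled in the source text, not a confirmed detail). With 10 codes of length m = 12, the first bit has $P(S_1, 1) = 0.9$ and the second has $P(S_2, 1) = 0.5$, so:

$$\omega_1 = 1 + \frac{\max(0.1,\ 0.9)}{12} = 1.075, \qquad \omega_2 = 1 + \frac{\max(0.5,\ 0.5)}{12} \approx 1.042$$

Under this assumption a mismatch on the first bit costs slightly more than a mismatch on the second, which is the ordering the example describes.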

The calculation of the weight ω shows that the weights serve mainly to distinguish samples at the same Hamming distance, while the basic relationship between Hamming distance and similarity is preserved. The relationship between the Hamming distance and the weighted Hamming distance is as follows:

In the formula, the weighted quantity is the weighted Hamming distance. It further refines the ranking rule without sacrificing the efficiency of the Hamming distance, and to some extent overcomes the problem of ordering results that lie at the same distance.
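The relationship itself is not reproduced in the source text. As a hedged sketch of why the ordering is preserved, assume the weight form $\omega_k = 1 + \max(P(S_k, 0), P(S_k, 1)) / m$ and write $x_k \in \{0, 1\}$ for whether bit k differs between the two codes. Since $\max(P(S_k, 0), P(S_k, 1)) \in [\tfrac{1}{2}, 1]$:

$$1 + \frac{1}{2m} \le \omega_k \le 1 + \frac{1}{m}$$

$$dist_{WH} = \sum_{k=1}^{m} \omega_k x_k \quad\Longrightarrow\quad \left(1 + \frac{1}{2m}\right) dist_H \;\le\; dist_{WH} \;\le\; \left(1 + \frac{1}{m}\right) dist_H \;\le\; dist_H + 1$$

The last step uses $dist_H \le m$. Under this assumption, a sample at Hamming distance d + 1 has a weighted distance strictly greater than d + 1, while a sample at Hamming distance d has a weighted distance of at most d + 1, so the weights only reorder samples that share the same Hamming distance.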

Examples

Experiments are carried out on the CIFAR-10 (A. Krizhevsky, G. Hinton. Learning Multiple Layers of Features from Tiny Images [J]. 2012.) and NUS-WIDE (Zhang P, Zhang W, Li W J, et al. Supervised Hashing with Latent Factor Models [M]. 2014.) data sets, so that the experimental comparison is valid and reliable. For CIFAR-10, 600 image samples are drawn from each class as experimental data, of which 500 are used as training data and the other 100 as test data. Since NUS-WIDE is a multi-label data set, two sample images are considered to belong to the same class if they share at least one label. Following the same calculation method as the compared works, the mAP over the top 5000 returned samples is used as the final comparison figure.

The results show that FastH, CNNH and NINH, which are combined with deep neural networks, are more accurate than the traditional methods. In CNNH, the hash codes fitted by the deep neural network are suboptimal compared with the hash codes obtained by other hash learning methods. In comparison, the deep hash method proposed here achieves better experimental results, and its mAP rises as the hash code length increases. As shown in Table 2, the deep hash model proposed by the invention improves on the other methods to some extent. Compared with the traditional hashing methods LSH, SH and ITQ, the improvement is marked. Compared with other hash learning methods such as FastH, CNNH and NINH, the method also improves on both the CIFAR-10 and NUS-WIDE data sets, which verifies the performance of the deep hash model for hash coding.
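The evaluation described above can be sketched as follows. The patent does not spell out its mAP formula, so this uses one common convention (average precision over the top K results, normalised by the number of relevant items found in those K); the function names are hypothetical, and `shares_label` encodes the stated NUS-WIDE rule that two images match if they share at least one label.

```python
from typing import Iterable, List, Sequence


def shares_label(labels_a: Iterable[str], labels_b: Iterable[str]) -> bool:
    """Multi-label relevance rule: same class if at least one label is shared."""
    return bool(set(labels_a) & set(labels_b))


def average_precision(relevant: Sequence[bool], k: int) -> float:
    """Average of precision@rank over the relevant items in the top k results."""
    hits = 0
    total = 0.0
    for rank, is_relevant in enumerate(relevant[:k], start=1):
        if is_relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(rankings: List[Sequence[bool]], k: int = 5000) -> float:
    """Mean of the per-query average precision values."""
    return sum(average_precision(r, k) for r in rankings) / len(rankings)


# Each inner list marks, in ranked order, whether a returned image is relevant.
rankings = [[True, False, True], [False, False]]
score = mean_average_precision(rankings, k=5000)
```

For the first toy query the precision at the two relevant ranks is 1/1 and 2/3, giving an average precision of 5/6; the second query has no relevant result and contributes 0.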

TABLE 2 comparison of dataset retrieval accuracy (mAP) results

As can be seen from Table 2, the deep hash network model proposed here improves markedly on the CIFAR-10 data set, by 3.8%, 3.5%, 5.0% and 5.1% for the different hash code lengths. On the NUS-WIDE data set, the improvements for the different hash code lengths are 5%, 6.8%, 5.4% and 6.8%, respectively. The comparison experiments show that hash codes of different lengths improve to some extent on both data sets.

Feature extraction uses the triplet loss function to extract image features. The length of the extracted features is also a key factor in hash code generation: features that are too short tend to overfit in the hash layer, while features that are too long may include interfering features that degrade the hash code. To find the optimal feature layer length, the influence of different feature lengths on the final mAP is compared. The lengths tested are multiples of the final hash code length: "L", "2 × L", "3 × L", "4 × L" and "5 × L", where "L" is the length of the generated hash code. The broken lines in the comparison graphs represent the results for hash codes of different lengths.

The line graphs for the two data sets (FIG. 3 and FIG. 4) show that when the feature layer length is 1 times the hash code length, the hash code is obtained directly by processing at that layer, but the result is mediocre. As the feature layer length increases, the result improves and is best when the feature layer is 4 times the hash code length; increasing the length further slightly reduces the mAP for part of the data. A feature layer length of 4 times the hash code length is therefore the optimal value in this experiment.

The visualization experiment performs image retrieval mainly on the CIFAR-10 data set. CIFAR-10 is a single-label data set in which each sample image contains little information, so individual features are represented more precisely and the returned results can be shown more intuitively. The experiment returns the TOP-K samples whose hash codes have the smallest Hamming distance to the hash code of the query sample: the first image in each row is the query sample, followed by the 10 sample images closest to it in Hamming distance. The returned images show that the features extracted by the deep hash model separate the different categories. From the standpoint of objective classification, image retrieval based on the hash codes generated by the deep hash network model is fairly accurate; from a subjective standpoint, however, the similarity between the returned results and the query sample is only moderate, and the retrieved images merely tend toward samples of the same class, as shown in FIG. 5.

This experiment compares and verifies the rearrangement of the deep hash return results: hash codes are generated for the CIFAR-10 data set by the triplet-based deep hash network model, and retrieval results are then returned using the feature-weight-based rearrangement algorithm. The experiment has one key parameter, the neighbor threshold in Hamming space, which is set to 2; that is, results whose Hamming distance is less than 2 are rearranged by the feature-weight-based rearrangement algorithm before being returned. As shown in the figure, visual inspection finds that the rearranged results share more obvious features with the query sample, and the returned results are more reasonable. A notable property of the algorithm is that it distinguishes returned results that have the same Hamming distance but different hash codes, so the rearranged results contain the same items as the results that are not rearranged. Comparing the TOP-K results returned directly by Hamming distance with the results after rearrangement, the number of similar samples among the first 10 returned results increases, and the accuracy improves accordingly. The comparison shows better similarity in subjective visual terms, as shown in FIG. 6.

Beyond subjective judgment, the accuracy of the returned results differs for different values of K in TOP-K, and a pattern can be summarized by comparing the accuracy for different K values experimentally. The smaller the K value, the higher the accuracy after rearrangement; as K increases, the gap between the accuracy after rearrangement and the accuracy before rearrangement gradually narrows until the two are equal. This also verifies that the weight-based hash rearrangement algorithm can distinguish, by similarity to the query sample, returned results that lie at the same Hamming distance. The change in accuracy is shown in FIG. 7.

The comparison experiments on the deep hash network structure and on the rearrangement algorithm verify that the deep hash network structure provided by the invention performs better than the compared methods, and that the rearrangement algorithm gives visually better results for hash-code-based image retrieval. Conventional research on hash functions generally compares similarity by Hamming distance, which does not discriminate sufficiently when the data scale is large.

Claims

1. The image data fast retrieval method based on Hash learning is characterized by comprising the following steps:

step 1, establishing a deep hash model:

step 2, training a deep Hash model:

step 4, retrieving by using the hash code of the query image and the image sample library, wherein the specific process is as follows:

obtaining the neighbor set p of the query sample through $\|p_{query} - p_i\|_2 < \varepsilon$;

determining, for each bit position over all hash codes in the neighbor set p, which of "0" or "1" occurs in the larger proportion, giving $S = \{S_1, S_2, S_3, \ldots, S_m\}$, $S_i \in \{0, 1\}$;

the images returned from the database for the query image, and their order, are determined by the weighted Hamming distance.

2. The hash learning-based image data fast retrieval method according to claim 1, wherein the threshold $\varepsilon$ in the neighbor definition is 2.

3. The hash learning-based image data fast retrieval method according to claim 2, wherein the feature layer length is 4 times the hash code length.

4. The hash learning-based image data fast retrieval method according to claim 1, 2 or 3, wherein the hash layer uses a pairwise constraint as its constraint condition, and the input at this stage is the feature vectors of the feature layer, i.e. the pairwise-constraint input $\{p_i, p_j, w_{ij}\}$; $w_{ij} = 1$ indicates that the samples represented by the two feature vectors belong to the same class, and $w_{ij} = 0$ indicates that they do not; the feature vectors generated by the feature layer are $F_i, F_j \in R^d$, the outputs mapped into Hamming space are $b_i, b_j \in \{-1, 1\}^m$, and $dist_H(b_i, b_j)$ is the Hamming distance between $b_i$ and $b_j$; the loss function is as follows:

5. The hash learning-based image data fast retrieval method according to claim 4, wherein the deep hash model is trained online on mini-batches of data, and the mini-batch triplets are created according to the following rules: (1) determine, within the mini-batch, the number of selected samples for each label, and take the smallest per-label sample count; (2) randomly shuffle the samples of a given label, and select the i-th and (i+1)-th samples as the anchor $p_i$ and positive example $p_j$ of a triplet; (3) randomly select a sample with a different label as the negative example $p_k$ for the anchor $p_i$; (4) loop over all labels and all samples, generating random combinations that each contain an anchor, a positive example and a negative example.