CN109977250A - Deep hashing image retrieval method fusing semantic information and multilevel similarity - Google Patents
- Publication number
- CN109977250A (application CN201910211486.6A)
- Authority
- CN
- China
- Prior art keywords
- vector
- image
- hash
- label
- matrix
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a deep hashing image retrieval method fusing semantic information and multilevel similarity, comprising the following steps: S1, construct an image database; S2, construct a label-vector matrix and a semantic-vector matrix; S3, construct a similarity matrix; S4, build a deep hashing neural network model that converts original images into approximate hash vectors; S5, construct a loss function that places a lower bound on the Hamming distance between the hash vectors of similar pictures; S6, train the deep hashing neural network model; S7, construct a hash-vector database for the images; S8, compare the hash vector of the image to be retrieved with the vectors in the hash-vector database to find similar images. By fusing semantic information the invention improves retrieval precision, and by constraining the lower bound of the Hamming distance between the hash vectors of two similar pictures it improves ranking performance.
Description
Technical field
The present invention relates to the field of image retrieval technology, and in particular to a deep hashing image retrieval method fusing semantic information and multilevel similarity.
Background technique
In recent years, with the development of the Internet, massive image data has posed a huge challenge to image retrieval. Faced with large-scale and complex image data, a retrieval system must balance retrieval efficiency against retrieval quality, while also addressing the storage of massive information, so as to deliver a better user experience. Studying better image retrieval methods therefore has high practical significance.
Image retrieval based on deep hashing is currently a popular approach. Its advantage is that, by mapping pictures to binary hash vectors, it can exploit fast bitwise comparison to increase retrieval speed while reducing the required storage space.
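The bitwise-comparison advantage can be sketched as follows. This is an illustrative example with arbitrary 64-bit codes, not part of the patent: once binary hash vectors are packed as bits, the Hamming distance between two codes reduces to one XOR plus a population count.

```python
# Two 64-bit binary hash codes packed into Python ints (hypothetical values).
h1 = 0b1011001011110000101100101111000010110010111100001011001011110000
h2 = 0b1011001011110000101100101111000010110010111100001011001011010000

# Hamming distance via XOR + popcount: one machine-level comparison
# instead of 64 per-element comparisons of floating-point features.
hamming = bin(h1 ^ h2).count("1")   # number of differing bits
```

The same trick applies to any code length; retrieval then ranks database images by this bit-level distance.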
Faced with more complex pictures, traditional deep hashing methods show obvious shortcomings. On the one hand, they measure the similarity between different images too coarsely: as long as two pictures share a label they are treated as similar, and otherwise as dissimilar. This ignores both the finer-grained similarity grades between pictures and the semantic information the pictures contain. On the other hand, the traditional pairwise loss function uses only a single threshold to constrain the upper bound of the Hamming distance between the hash vectors of two similar pictures, with no constraint on the lower bound. As a result, the relative distances between images with different degrees of similarity cannot be guaranteed, which reduces the ranking accuracy of retrieval results.
Summary of the invention
The object of the present invention is to overcome the above deficiencies of the prior art by providing a deep hashing image retrieval method that fuses semantic information and multilevel similarity. By fusing semantic information it takes finer-grained similarity grades between pictures into account and improves retrieval precision; and by using a new pairwise loss function that places a lower bound on the Hamming distance between the hash vectors of similar pictures, it improves the ranking accuracy of retrieval results.
To achieve the above object, the present invention provides the following technical scheme:
A deep hashing image retrieval method fusing semantic information and multilevel similarity, comprising the following steps:
S1, construct an image database;
S2, construct a label-vector matrix and a semantic-vector matrix;
S3, construct a similarity matrix;
S4, build a deep hashing neural network model that converts original images into approximate hash vectors;
S5, construct a loss function that places a lower bound on the Hamming distance between the hash vectors of similar pictures;
S6, train the deep hashing neural network model;
S7, construct a hash-vector database for the images;
S8, compare the hash vector of the image to be retrieved with the vectors in the hash-vector database to find similar images.
Building the similarity matrix from the label-vector matrix and the semantic-vector matrix fuses semantic information and improves retrieval precision; constructing a loss function with a lower-bound constraint on the Hamming distance between the hash vectors of similar pictures improves the ranking accuracy of retrieval results.
Preferably, in step S2, the label-vector matrix and the semantic-vector matrix are constructed as follows: images with their labels and text descriptions are randomly selected from the image database. The label information is used to build the label-vector matrix L, where L[i, j] = 0 means the i-th picture does not carry the j-th label and L[i, j] = 1 means it does. Using natural language processing, the text description of each picture is encoded into a vector to build the picture semantic-vector matrix C, where C_i is the vector encoding the text description of the i-th picture and represents its semantic information.
Preferably, in step S3, the similarity matrix is constructed as follows:
S3-1, using the label-vector matrix built in step S2, take pairwise inner products of the label vectors to build the label similarity matrix:

    s_label[i, j] = (L L^T)[i, j] / L_total[i, j], 1 <= i, j <= n   (1)

In formula (1), S_label is the label similarity matrix, s_label[i, j] is the label similarity between image i and image j, n is the number of pictures, L is the label-vector matrix built in S2, L^T is the transpose of L, and L_total is the total-label matrix between pictures, where L_total[i, j] is the number of labels that pictures i and j carry in total;
S3-2, using the semantic-vector matrix built in step S2, compute the pairwise cosine similarity of the semantic vectors to build the semantic similarity matrix:

    s_seman[i, j] = c_{i,j}, 1 <= i, j <= n   (2)
    c_{i,j} = <C_i, C_j> / (||C_i|| * ||C_j||)   (3)

In formula (2), S_seman is the semantic similarity matrix and s_seman[i, j] is the semantic similarity between image i and image j, n being the number of pictures; in formula (3), C is the semantic-vector matrix built in S2 and ||C_i|| is the norm of the vector C_i;
S3-3, build the similarity matrix from the label similarity matrix and the semantic similarity matrix:

    s[i, j] = w * s_label[i, j] + (1 - w) * s_seman[i, j], 1 <= i, j <= n   (4)

In formula (4), S is the similarity matrix, s[i, j] is the similarity between image i and image j, n is the number of pictures, and w is a weight coefficient.
Preferably, step S4 comprises the following steps:
S4-1, build the AlexNet network model with the TensorFlow deep learning framework, and pre-train the AlexNet network model on the ImageNet dataset;
S4-2, modify the classical AlexNet model to construct the deep hashing neural network model.
The constructed deep hashing neural network is structured as follows:
It contains 5 convolutional layers: the first, second, third, fourth and fifth convolutional layers;
and 3 fully connected layers: the first fully connected layer, the second fully connected layer and the hash layer.
When AlexNet is built with TensorFlow and pre-trained, the model updates its parameters automatically by backpropagation without manual parameter tuning, so pre-training is simple.
Preferably, the hash layer contains 64 neurons.
Preferably, in step S5, the designed loss function is given by formula (8). In formula (8), L denotes the loss function; s[i, j] is the similarity between image i and image j, s_label[i, j] the label similarity between them, and S the similarity matrix; alpha is the hyperparameter adjusting the upper threshold and beta the hyperparameter adjusting the lower threshold; sigma is the first parameter and delta the second parameter; N_bits is the length of the generated hash vector; b_i and b_j are the approximate hash vectors of the i-th and j-th images, and D(b_i, b_j) is the Euclidean distance between them; gamma is a weight coefficient; 1 denotes the all-ones vector of the same dimension as b_i, and || |b_i| - 1 ||_1 and || |b_j| - 1 ||_1 denote the sums of elementwise differences between the absolute-value vectors of b_i, b_j and the all-ones vector. The upper and lower thresholds adapt to the similarity s[i, j] between pictures: for two pictures that share a label, the Euclidean distance between their approximate hash vectors should fall between the lower and upper bounds, while for two pictures that share no label the distance between their approximate hash vectors is widened as far as possible, improving retrieval precision.
Preferably, in step S6, the deep hashing neural network model is trained by stochastic gradient descent:

    mu' = mu - lambda * dL/dmu   (9)

In formula (9), mu is any parameter of the deep hashing neural network model, mu' its updated value, lambda the update amplitude, L the loss function and dL/dmu the gradient of L with respect to mu. Training by stochastic gradient descent updates the parameters of the deep hashing neural network model and improves retrieval precision.
Preferably, in step S7, the images in the image database are fed into the deep hashing neural network model trained in step S6, yielding the approximate hash vector set B = {b_1, b_2, ..., b_n}, where n is the number of images in the database and b_n is the approximate hash vector of the n-th image. Passing B through a sign function yields the corresponding binary hash vector database H = {h_1, h_2, ..., h_n}, where h_n denotes the binary hash vector of the n-th image. Representing picture features as binary hash vectors increases retrieval speed.
Compared with the prior art, the beneficial effects of the present invention are:
1. The similarity matrix is built from the label-vector matrix and the semantic-vector matrix, fusing semantic information and improving retrieval precision;
2. A loss function with a lower-bound constraint on the Hamming distance between the hash vectors of similar pictures improves the ranking accuracy of retrieval results;
3. The upper and lower thresholds adapt to the similarity s[i, j] between pictures: for two pictures sharing a label, the Euclidean distance between their approximate hash vectors falls between the bounds, while for two pictures sharing no label the distance is widened as far as possible, improving retrieval precision;
4. Picture features are represented as binary hash vectors, increasing retrieval speed.
Description of the drawings:
Fig. 1 is a flow chart of the deep hashing image retrieval method fusing semantic information and multilevel similarity of exemplary embodiment 1 of the present invention;
Fig. 2 is a structure diagram of the deep hashing neural network model of exemplary embodiment 1.
Reference numerals: 11 - first convolutional layer; 12 - second convolutional layer; 13 - third convolutional layer; 14 - fourth convolutional layer; 15 - fifth convolutional layer; 21 - first max-pooling layer; 22 - second max-pooling layer; 23 - third max-pooling layer; 31 - first fully connected layer; 32 - second fully connected layer; 41 - hash layer.
Specific embodiments
The present invention is described in further detail below with reference to test examples and specific embodiments. This should not be understood as limiting the scope of the invention to the following embodiments; all techniques realized on the basis of the present disclosure fall within the scope of the invention.
Embodiment 1
As shown in Fig. 1, this embodiment provides a deep hashing image retrieval method fusing semantic information and multilevel similarity, comprising the following steps.
S1: construct an image database.
Select the K most frequent label classes in a dataset, together with the pictures carrying these K labels, to build the image database.
In this embodiment the invention uses Microsoft's public COCO dataset, in which each image corresponds to several label classes (e.g. person, water, car, etc.). The invention selects the K label classes with the highest occurrence counts (sorted in descending order) and the images carrying those classes to build the image database. For example, the invention selects the 20 most frequent label classes in the COCO dataset and the corresponding images to build its image database.
S2: construct the label-vector matrix and the semantic-vector matrix.
Images with their labels and text descriptions are randomly selected from the image database to build the label-vector matrix and the semantic-vector matrix.
In this embodiment, the invention randomly selects n images and their labels from the image database to form the training set T = {t_1, t_2, ..., t_n}, where t_n denotes the n-th image and its labels, n >= 1; t_n = {I_n, L_n}, where I_n is the n-th image and L_n is the label vector of the n-th image. The label vectors form the label-vector matrix L of size n x K, where n is the number of images and K the number of label classes; L[i, j] = 0 means the i-th picture does not carry the j-th label, and L[i, j] = 1 means it does. In addition, using natural language processing, the text description of each picture is encoded into a vector to build the picture semantic-vector matrix C, where C_i is the vector encoding the text description of the i-th picture and represents its semantic information. In this embodiment the text description of each picture is encoded into a 512-dimensional vector.
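Step S2 can be sketched as follows. The label sets and the 4-dimensional "semantic" vectors below are toy stand-ins for illustration only; the embodiment uses 512-dimensional text encodings produced by an unspecified NLP model.

```python
import numpy as np

K = 3                              # number of label classes (toy value)
labels = [[0, 2], [0], [1, 2]]     # label indices per image (hypothetical)

# Build the n x K label-vector matrix L of step S2.
n = len(labels)
L = np.zeros((n, K), dtype=int)
for i, lab in enumerate(labels):
    L[i, lab] = 1                  # L[i, j] = 1 iff image i carries label j

# Toy semantic-vector matrix C (one row per image's encoded description).
C = np.array([[0.9, 0.1, 0.0, 0.4],
              [0.8, 0.2, 0.1, 0.5],
              [0.0, 1.0, 0.3, 0.1]])
```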
S3: construct the similarity matrix.
S3-1: build the label similarity matrix from the label-vector matrix of step S2.
In this embodiment, pairwise inner products of all label vectors L_n are taken, and each inner product is divided by the total number of labels the two images involve; the results form the label similarity matrix, of size n x n, where n is the number of images. The label similarity matrix S_label is expressed as:

    s_label[i, j] = (L L^T)[i, j] / L_total[i, j], 1 <= i, j <= n   (1)

In formula (1), S_label is the label similarity matrix, s_label[i, j] is the label similarity between image i and image j, n is the number of pictures, L is the label-vector matrix built in S2, L^T is the transpose of L, and L_total is the total-label matrix between pictures, where L_total[i, j] is the number of labels that pictures i and j carry in total.
S3-2: compute the pairwise cosine similarity of the semantic vectors of step S2 to build the semantic similarity matrix.
In this embodiment the semantic similarity matrix has size n x n, n being the number of images, and is expressed as:

    s_seman[i, j] = c_{i,j}, 1 <= i, j <= n   (2)
    c_{i,j} = <C_i, C_j> / (||C_i|| * ||C_j||)   (3)

In formula (2), S_seman is the semantic similarity matrix and s_seman[i, j] is the semantic similarity between image i and image j, n being the number of pictures; in formula (3), C is the semantic-vector matrix built in S2 and ||C_i|| is the norm of the vector C_i.
S3-3: build the similarity matrix from the label similarity matrix and the semantic similarity matrix.
In this embodiment the similarity matrix S merges the label similarity matrix and the semantic similarity matrix obtained in steps S3-1 and S3-2. Its size is n x n, n being the number of images:

    s[i, j] = w * s_label[i, j] + (1 - w) * s_seman[i, j], 1 <= i, j <= n   (4)

In formula (4), S is the similarity matrix, s[i, j] is the similarity between image i and image j, n is the number of pictures, and w is a weight coefficient, set to 0.5 in this embodiment.
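The three sub-steps above can be sketched numerically. Because the patent's equation images are not reproduced in this text, two details are assumptions consistent with the surrounding description: L_total is taken as the count of distinct labels the pair carries (union), and the fusion is the weighted sum S = w * S_label + (1 - w) * S_seman.

```python
import numpy as np

L = np.array([[1, 0, 1],
              [1, 0, 0],
              [0, 1, 1]])            # n x K label-vector matrix from S2 (toy)
C = np.array([[0.9, 0.1, 0.0, 0.4],
              [0.8, 0.2, 0.1, 0.5],
              [0.0, 1.0, 0.3, 0.1]]) # toy semantic-vector matrix from S2
w = 0.5                              # weight coefficient of the embodiment

# (1) label similarity: shared labels / labels the pair carries in total.
shared = L @ L.T                     # shared[i, j] = labels images i, j share
counts = L.sum(axis=1)
L_total = counts[:, None] + counts[None, :] - shared  # union count (assumed)
S_label = shared / L_total

# (2)-(3) semantic similarity: pairwise cosine similarity of the rows of C.
norms = np.linalg.norm(C, axis=1)
S_seman = (C @ C.T) / np.outer(norms, norms)

# (4) fused similarity matrix (assumed weighted-sum form).
S = w * S_label + (1 - w) * S_seman
```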
S4: build the deep hashing neural network model and convert original images into approximate hash vectors.
S4-1: build the AlexNet network model with the TensorFlow deep learning framework, and pre-train it on the ImageNet dataset.
TensorFlow is a relatively high-level machine learning library with which users can conveniently design neural network structures; it supports automatic differentiation, so users no longer need to derive gradients by hand for backpropagation. CNNs (convolutional neural networks) are the core algorithm of image recognition and classification, and AlexNet is a classical deep learning model among them. The AlexNet network model is built with TensorFlow and pre-trained on the ImageNet dataset, i.e. the randomly initialized AlexNet network is trained for classification so that its parameters learn generic features. The parameters are updated automatically by backpropagation, and no manual parameter tuning is needed.
S4-2: construct the deep hashing neural network model.
This embodiment modifies the classical AlexNet model to construct the deep hashing neural network model and improve retrieval precision.
The invention removes the last fully connected layer of the pre-trained AlexNet network model, retains the remaining structure and parameters, and appends a new hash layer fch to the network. The fch layer contains 64 neurons, and its activation function is set to tanh, so that every neuron output of fch lies in [-1, 1]. The structure of the constructed deep hashing neural network model is shown in Fig. 2:
It contains 5 convolutional layers: the first convolutional layer (conv1) 11, second convolutional layer (conv2) 12, third convolutional layer (conv3) 13, fourth convolutional layer (conv4) 14 and fifth convolutional layer (conv5) 15;
and 3 fully connected layers: the first fully connected layer (fc6) 31, second fully connected layer (fc7) 32 and hash layer (fch) 41.
The input of the first convolutional layer (conv1) 11 receives the original image; its output feeds the first max-pooling layer 21, whose output feeds the second convolutional layer (conv2) 12. The output of conv2 feeds the second max-pooling layer 22, whose output feeds the third convolutional layer (conv3) 13. Conv3 feeds the fourth convolutional layer (conv4) 14, which feeds the fifth convolutional layer (conv5) 15. The output of conv5 feeds the third max-pooling layer 23, whose output feeds the first fully connected layer (fc6) 31. Fc6 feeds the second fully connected layer (fc7) 32, whose output feeds the hash layer (fch) 41; the output of the hash layer (fch) 41 is the approximate hash vector.
The invention feeds the original image into the deep hashing neural network model; after the mappings of the convolutional and fully connected layers, an approximate hash vector is obtained, each dimension of which lies in [-1, 1]. For example, an image of original size 227 x 227 x 3 fed into the constructed model passes through the 5 convolutional layers and 3 fully connected layers and yields a 64-dimensional approximate hash vector. The invention can also feed multiple images at once, obtaining the approximate hash vector set B = {b_1, b_2, ..., b_n}, where b_n denotes the approximate hash vector of the n-th image.
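The effect of the hash layer fch can be illustrated in isolation. The weights below are random stand-ins, not pretrained AlexNet parameters; the point is only that a 64-neuron fully connected layer with tanh activation maps fc7-style features to an approximate hash vector whose components all lie in [-1, 1].

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the 4096-dim fc7 feature vector of one image.
fc7_features = rng.standard_normal(4096)

# Hash layer fch: 64 neurons, tanh activation (weights random, illustrative).
W = rng.standard_normal((64, 4096)) * 0.01
bias = np.zeros(64)
approx_hash = np.tanh(W @ fc7_features + bias)   # approximate hash vector b
```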
S5: construct a loss function with a lower-bound constraint on the Hamming distance between the hash vectors of similar pictures.
S5-1: compute the similarity between the approximate hash vectors of pictures.
In this embodiment, for the approximate hash vector set B = {b_1, b_2, ..., b_n} obtained in S4, the Euclidean distance D(b_i, b_j) between two approximate hash vectors in the set expresses their similarity, where b_i and b_j denote the approximate hash vectors of the i-th and j-th images.
S5-2: based on the Euclidean distance D(b_i, b_j), construct the pairwise loss function L_p, expressed as formula (5).
In formula (5), L_p denotes the pairwise loss function; s[i, j] is the similarity between image i and image j, s_label[i, j] the label similarity between them, and S the similarity matrix; alpha is the hyperparameter adjusting the upper threshold and beta the hyperparameter adjusting the lower threshold; sigma is the first parameter, set to 2.5 in this embodiment, and delta the second parameter, set to 1.5 in this embodiment; N_bits is the length of the generated hash vector, 64 in this embodiment; b_i and b_j are the approximate hash vectors of the i-th and j-th images, and D(b_i, b_j) is the Euclidean distance between them.
The meaning of this pairwise loss is: for two pictures that share a label, the Euclidean distance between their approximate hash vectors should fall between the lower and upper bounds, where both thresholds adapt to the similarity s[i, j] between the pictures. For two pictures that share no label, the Euclidean distance between their approximate hash vectors is widened as far as possible, and no loss is produced once it exceeds the defined threshold delta * N_bits.
S5-3: define the quantization loss L_q for the approximate hash vectors, expressed as formula (6).
In formula (6), 1 denotes the all-ones vector of the same dimension as b_i, and || |b_i| - 1 ||_1 and || |b_j| - 1 ||_1 denote the sums of elementwise differences between the absolute-value vectors of b_i, b_j and the all-ones vector.
The meaning of this formula is: the closer each dimension of an approximate hash vector is to 1 or -1, the higher the probability that it is a valid hash vector, and the smaller the loss produced.
S5-4: construct the complete loss function.
In this embodiment the loss function L merges the pairwise loss L_p and the quantization loss L_q as L = L_p + gamma * L_q (7), where gamma denotes the weight coefficient of the quantization loss, set to 1.0 in this embodiment. Substituting formulas (5) and (6) into (7) yields the complete loss function L of formula (8).
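The loss of S5 can be sketched as follows. Because formulas (5)-(8) are not reproduced in this text, the exact expressions below are assumptions that merely realize the behavior described above: a squared Euclidean distance compared against similarity-adaptive lower/upper bounds for label-sharing pairs, a hinge pushing non-sharing pairs beyond delta * N_bits, and a quantization term pulling each component toward +/-1. The values of alpha and beta and the form of the bounds are hypothetical.

```python
import numpy as np

alpha, beta = 1.0, 0.25   # upper/lower threshold hyperparameters (assumed values)
sigma, delta = 2.5, 1.5   # first and second parameters from the embodiment
gamma = 1.0               # quantization-loss weight from the embodiment
N_bits = 64               # hash-vector length from the embodiment

def pair_loss(b_i, b_j, s_ij, shares_label):
    d2 = float(np.sum((b_i - b_j) ** 2))   # squared Euclidean distance (assumed)
    if shares_label:
        # Bounds adapt to the pair similarity s_ij (assumed functional form).
        upper = alpha * sigma * (1.0 - s_ij) * N_bits
        lower = beta * sigma * (1.0 - s_ij) * N_bits
        hinge = max(0.0, d2 - upper) + max(0.0, lower - d2)
    else:
        # Widen pairs sharing no label until d2 exceeds delta * N_bits.
        hinge = max(0.0, delta * N_bits - d2)
    # Quantization loss: L1 distance of |b| from the all-ones vector.
    quant = float(np.abs(np.abs(b_i) - 1).sum() + np.abs(np.abs(b_j) - 1).sum())
    return hinge + gamma * quant
```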
S6: train the constructed deep hashing neural network model.
S6-1: construct the optimization objective from the loss function.
In this embodiment the invention uses the constructed loss function to form the optimization objective min_Theta L, i.e. find the values of all parameters in Theta that minimize L, where Theta is the set of parameters of the deep hashing neural network model and L the constructed loss function.
S6-2: solve the optimization objective by stochastic gradient descent.
In this embodiment the invention solves the optimization objective by stochastic gradient descent: the gradient of the loss function L with respect to each parameter mu is computed, and the parameter is updated in the opposite direction of the gradient:

    mu' = mu - lambda * dL/dmu   (9)

In formula (9), mu denotes any parameter of the deep hashing neural network model, mu' its updated value, L the loss function and dL/dmu the gradient of L with respect to mu; lambda is the update amplitude (i.e. the learning rate), and may be set to 0.0003.
In this embodiment the batch size is 256 and the number of iterations is 10000.
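The update rule of formula (9) can be sketched directly. The toy gradient below stands in for the network's backpropagated gradient; only the update rule itself is the point.

```python
import numpy as np

lam = 0.0003                       # learning rate lambda from the embodiment

def sgd_step(mu, grad):
    # Formula (9): mu' = mu - lambda * dL/dmu.
    return mu - lam * grad

mu = np.array([1.0, -2.0])         # toy parameters
grad = np.array([2.0, -4.0])       # toy gradient, e.g. of L = mu_0^2 + mu_1^2
mu_new = sgd_step(mu, grad)
```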
S7: construct the hash-vector database of the images.
In this embodiment the invention feeds the images in the image database into the trained deep hashing neural network model, obtaining the approximate hash vector set B = {b_1, b_2, ..., b_n}, where n is the number of images in the image database and b_n is the approximate hash vector of the n-th image. B is then passed through a sign function (whose effect is to map numbers greater than or equal to 0 to 1, and numbers less than 0 to -1), obtaining the corresponding binary hash vector database H = {h_1, h_2, ..., h_n}, where n is the number of images in the image database and h_n denotes the binary hash vector of the n-th image.
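The sign-function quantization of S7 can be sketched as follows; the approximate hash vectors are toy 4-dimensional values rather than the embodiment's 64-dimensional ones.

```python
import numpy as np

def sign_quantize(b):
    # Map components >= 0 to 1 and components < 0 to -1, as described in S7.
    return np.where(b >= 0, 1, -1)

# Toy approximate hash vectors b_1, b_2 with values in [-1, 1].
B = np.array([[0.9, -0.3, 0.0, -0.99],
              [-0.1, 0.8, 0.7, 0.2]])
H = sign_quantize(B)   # binary hash vector database
```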
S8: compare the hash vector of the image to be retrieved with the vectors in the hash-vector database to find similar images.
In this embodiment the invention feeds the image i to be retrieved into the trained deep hashing neural network model, obtaining its approximate hash vector b_i, and then applies the sign function to obtain the binary hash vector h_i of image i. The hash vector h_i is compared (via the AND operation) with every hash vector in the constructed image hash-vector database, and the resulting values are sorted in descending order. A larger result value indicates a higher similarity between that hash vector and h_i, i.e. the image corresponding to that hash vector is more similar to the query image i, which ensures retrieval precision. For example, h_i is compared with the first hash vector of the image hash-vector database to obtain a first result value, and with the second hash vector to obtain a second result value; if the first result value is greater than the second, the image corresponding to the first hash vector is more similar to the query image than the image corresponding to the second hash vector.
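Under the reading that the comparison operation on {-1, 1} codes amounts to an inner product (for N-bit codes, <h_i, h> = N - 2 * Hamming(h_i, h), so a larger score means a smaller Hamming distance), step S8 can be sketched as follows with toy 4-bit codes.

```python
import numpy as np

def rank_database(h_query, H_db):
    # One score per database image; larger score = fewer differing bits.
    scores = H_db @ h_query
    order = np.argsort(-scores)          # indices, most similar first
    return order, scores

h_query = np.array([1, -1, 1, 1])
H_db = np.array([[1, -1, 1, 1],          # identical to the query
                 [1, 1, 1, 1],           # differs in one bit
                 [-1, 1, -1, -1]])       # opposite in every bit
order, scores = rank_database(h_query, H_db)
```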
The above is only a detailed description of specific embodiments of the invention, not a limitation of it. Various replacements, modifications and improvements made by those skilled in the relevant technical fields without departing from the principle and scope of the invention shall all fall within the protection scope of the invention.
Claims (8)
1. A deep hashing image retrieval method fusing semantic information and multilevel similarity, characterized by comprising the following steps:
S1, construct an image database;
S2, construct a label-vector matrix and a semantic-vector matrix;
S3, construct a similarity matrix;
S4, build a deep hashing neural network model that converts original images into approximate hash vectors;
S5, construct a loss function that places a lower bound on the Hamming distance between the hash vectors of similar pictures;
S6, train the deep hashing neural network model;
S7, construct a hash-vector database for the images;
S8, compare the hash vector of the image to be retrieved with the vectors in the hash-vector database to find similar images.
2. The deep hashing image retrieval method fusing semantic information and multilevel similarity of claim 1, characterized in that, in step S2, the label-vector matrix and the semantic-vector matrix are constructed as follows: images with their labels and text descriptions are randomly selected from the image database; the label information is used to build the label-vector matrix L, where L[i, j] = 0 means the i-th picture does not carry the j-th label and L[i, j] = 1 means it does; using natural language processing, the text description of each picture is encoded into a vector to build the picture semantic-vector matrix C, where C_i is the vector encoding the text description of the i-th picture and represents its semantic information.
3. The deep hash image retrieval method fusing semantic information and multilevel similarity according to claim 2, characterized in that, in step S3, the similarity matrix is constructed as follows:
S3-1: using the label vector matrix built in step S2, the mutual inner products of the label vectors are computed to construct the label similarity matrix:
In formula (1), S_label is the label similarity matrix, s^{label}_{i,j} is the label similarity between image i and image j, n is the number of pictures, L is the label vector matrix built in S2, L^T is the transpose of L, and L_total is the total-label matrix between pictures, where L_total[i, j] is the total number of labels carried by pictures i and j;
S3-2: using the semantic vector matrix built in step S2, the mutual cosine similarities of the semantic vectors are computed to construct the semantic similarity matrix:
In formulas (2) and (3), S_seman is the semantic similarity matrix, s^{seman}_{i,j} is the semantic similarity between image i and image j, n is the number of pictures, C is the semantic vector matrix built in S2, and ||C_i|| is the norm of the vector C_i;
S3-3: the similarity matrix is constructed from the label similarity matrix and the semantic similarity matrix:
In formula (4), S is the similarity matrix, s_{i,j} is the similarity between image i and image j, n is the number of pictures, and w is a weight coefficient.
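Steps S3-1 to S3-3 can be sketched as follows. The formula images (1)-(4) are not reproduced in the text, so the exact normalisation of the inner products by L_total and the weighted combination S = w·S_label + (1-w)·S_seman are assumptions read off the surrounding definitions, not the patent's literal formulas.

```python
import numpy as np

# Toy inputs: a label vector matrix L and a semantic vector matrix C.
L = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0]], dtype=float)
C = np.array([[1.0, 2.0], [2.0, 1.0], [1.0, 1.0]])

# S3-1: label similarity = shared labels / total labels (assumed normalisation).
inner = L @ L.T                               # labels shared by pictures i and j
counts = L.sum(axis=1)
L_total = counts[:, None] + counts[None, :]   # labels of i and j in total
S_label = inner / L_total

# S3-2: mutual cosine similarity of the semantic vectors.
norms = np.linalg.norm(C, axis=1)
S_seman = (C @ C.T) / np.outer(norms, norms)

# S3-3: weighted combination with weight coefficient w (value illustrative).
w = 0.5
S = w * S_label + (1 - w) * S_seman
```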
4. The deep hash image retrieval method fusing semantic information and multilevel similarity according to claim 1, characterized in that step S4 comprises the following steps:
S4-1: building an AlexNet network model with the TensorFlow open-source deep learning framework, and pre-training the AlexNet network model on the ImageNet dataset;
S4-2: modifying the classical AlexNet model to construct the deep hash neural network model;
the deep hash neural network structure is built as follows:
five convolutional layers: the first, second, third, fourth, and fifth convolutional layers;
three fully connected layers: the first fully connected layer, the second fully connected layer, and the hash layer.
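The network plan of claim 4 (five convolutional layers, two fully connected layers, and a hash layer) can be recorded as a small configuration table. The filter counts and kernel sizes below follow the classical AlexNet and are assumptions; the claim itself fixes only the layer counts, and the 64-unit hash layer comes from claim 5.

```python
# Assumed layer plan of the modified AlexNet; only the counts and the
# hash layer width are stated in the claims, the rest mirrors AlexNet.
layers = [
    ("conv1", {"filters": 96,  "kernel": 11, "stride": 4}),
    ("conv2", {"filters": 256, "kernel": 5,  "stride": 1}),
    ("conv3", {"filters": 384, "kernel": 3,  "stride": 1}),
    ("conv4", {"filters": 384, "kernel": 3,  "stride": 1}),
    ("conv5", {"filters": 256, "kernel": 3,  "stride": 1}),
    ("fc6",   {"units": 4096}),
    ("fc7",   {"units": 4096}),
    ("hash",  {"units": 64, "activation": "tanh"}),  # output = approximate hash vector
]
n_conv = sum(name.startswith("conv") for name, _ in layers)
n_fc = sum(1 for name, _ in layers if name.startswith("fc")) + 1  # fc6, fc7, hash
```

In a TensorFlow build, the first seven layers would be initialised from the ImageNet pre-trained AlexNet (S4-1) and the hash layer trained from scratch (S4-2).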
5. The deep hash image retrieval method fusing semantic information and multilevel similarity according to claim 4, characterized in that the number of neurons in the hash layer is 64.
6. The deep hash image retrieval method fusing semantic information and multilevel similarity according to claim 3, characterized in that, in step S5, the designed loss function is:
In formula (8), 𝓛 denotes the loss function, s_{i,j} is the similarity between image i and image j, s^{label}_{i,j} is the label similarity between image i and image j, S is the similarity matrix, α is the hyperparameter adjusting the upper bound of the threshold, β is the hyperparameter adjusting the lower bound of the threshold, σ is the first parameter, δ is the second parameter, and N_bits is the length of the generated hash vector; b_i and b_j respectively denote the approximate hash vectors of the i-th and j-th images, and the distance term denotes the Euclidean distance between b_i and b_j; γ is a weight coefficient; 1 denotes the all-ones vector with the same dimension as b_i; and the remaining two terms respectively denote, for the approximate hash vectors b_i and b_j, the sum of the element-wise differences between the absolute-value vector and the all-ones vector.
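Formula (8) itself is not reproduced in the text. The sketch below therefore only combines the ingredients the claim names: a similarity-dependent distance term with an upper-bound threshold α for similar pairs and a lower-bound threshold β for dissimilar pairs, plus a quantisation term (weighted by γ) pulling each element of the approximate hash vector towards ±1. The exact functional form is an assumption, not the patent's loss.

```python
import numpy as np

def pairwise_loss(b_i, b_j, s_ij, alpha=2.0, beta=6.0, gamma=0.1):
    """Hedged stand-in for formula (8): bounded-distance loss + quantisation."""
    d = np.sum((b_i - b_j) ** 2)            # squared Euclidean distance
    if s_ij > 0:                            # similar pair: pull distance under alpha
        sim_term = max(d - alpha, 0.0)
    else:                                   # dissimilar pair: push distance past beta
        sim_term = max(beta - d, 0.0)
    # Quantisation: sum of differences between |b| and the all-ones vector.
    quant = np.sum(np.abs(np.abs(b_i) - 1.0)) + np.sum(np.abs(np.abs(b_j) - 1.0))
    return sim_term + gamma * quant
```

A similar pair of already-binary identical codes incurs zero loss, while a dissimilar pair with identical codes is penalised by the lower bound β.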
7. The deep hash image retrieval method fusing semantic information and multilevel similarity according to claim 6, characterized in that, in step S6, the deep hash neural network model is trained by stochastic gradient descent;
In formula (9), μ denotes any parameter of the deep hash neural network model, μ' denotes the updated parameter, λ denotes the magnitude of the update to μ, 𝓛 denotes the loss function, and the last factor denotes the gradient of 𝓛 with respect to μ.
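Formula (9) describes the standard stochastic gradient descent update μ' = μ − λ·∂𝓛/∂μ. A minimal sketch on a toy quadratic loss 𝓛(μ) = (μ − 1)², whose gradient is 2(μ − 1):

```python
def sgd_step(mu, grad, lam):
    """One update of formula (9): mu' = mu - lam * d(loss)/d(mu)."""
    return mu - lam * grad

mu = 4.0
for _ in range(50):
    mu = sgd_step(mu, 2.0 * (mu - 1.0), lam=0.1)
# mu converges towards the minimiser 1.0 of the toy loss
```

In the patent's setting, μ ranges over all network weights and the gradient is obtained by backpropagation through the deep hash network.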
8. The deep hash image retrieval method fusing semantic information and multilevel similarity according to claim 7, characterized in that, in step S7, the images in the image database are input into the deep hash neural network model trained in step S6 to obtain the approximate hash vector set B = {b_1, b_2, …, b_n}, where n is the number of images in the image database and b_n is the approximate hash vector of the n-th image; the approximate hash vector set B is then passed through a sign function to obtain the corresponding binary hash vector database set H = {h_1, h_2, …, h_n}, where n is the number of images in the image database and h_n denotes the binary hash vector of the n-th image.
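Steps S7 and S8 can be sketched end to end with synthetic data: binarise the approximate hash vectors with a sign function, then rank the binary codes by Hamming distance to the query code. The random vectors stand in for the network outputs.

```python
import numpy as np

rng = np.random.default_rng(0)
B = np.tanh(rng.normal(size=(5, 8)))   # approximate hash vectors b_1..b_n (synthetic)
H = np.where(B >= 0, 1, -1)            # S7: binary hash vector database h_1..h_n

query = H[2]                           # pretend the third image is the query
hamming = np.count_nonzero(H != query, axis=1)  # S8: Hamming distance to each code
nearest = np.argsort(hamming)          # database indices ranked by similarity
```

In practice the ±1 codes would be packed into bit strings so the Hamming distance reduces to an XOR and a popcount.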
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910211486.6A CN109977250B (en) | 2019-03-20 | 2019-03-20 | Deep hash image retrieval method fusing semantic information and multilevel similarity |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109977250A true CN109977250A (en) | 2019-07-05 |
CN109977250B CN109977250B (en) | 2023-03-28 |
Family
ID=67079595
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910211486.6A Active CN109977250B (en) | 2019-03-20 | 2019-03-20 | Deep hash image retrieval method fusing semantic information and multilevel similarity |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109977250B (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110532417A (en) * | 2019-09-02 | 2019-12-03 | 河北省科学院应用数学研究所 | Image search method, device and terminal device based on depth Hash |
CN111143400A (en) * | 2019-12-26 | 2020-05-12 | 长城计算机软件与系统有限公司 | Full-stack type retrieval method, system, engine and electronic equipment |
CN111709252A (en) * | 2020-06-17 | 2020-09-25 | 北京百度网讯科技有限公司 | Model improvement method and device based on pre-trained semantic model |
CN112734386A (en) * | 2021-01-13 | 2021-04-30 | 国家电网有限公司 | New energy network access full-flow through method and system based on association matching algorithm |
CN112765382A (en) * | 2021-01-20 | 2021-05-07 | 上海依图网络科技有限公司 | Image searching method, image searching device, image searching medium and electronic equipment |
CN113221658A (en) * | 2021-04-13 | 2021-08-06 | 卓尔智联(武汉)研究院有限公司 | Training method and device of image processing model, electronic equipment and storage medium |
CN113641845A (en) * | 2021-07-16 | 2021-11-12 | 广西师范大学 | Depth feature contrast weighted image retrieval method based on vector contrast strategy |
CN114219983A (en) * | 2021-12-17 | 2022-03-22 | 国家电网有限公司信息通信分公司 | Neural network training method, image retrieval method and device |
CN115878823A (en) * | 2023-03-03 | 2023-03-31 | 中南大学 | Deep hash method based on graph convolution network and traffic data retrieval method |
CN116645661A (en) * | 2023-07-27 | 2023-08-25 | 深圳市青虹激光科技有限公司 | Method and system for detecting duplicate prevention code |
CN118152914A (en) * | 2024-03-18 | 2024-06-07 | 山东管理学院 | Semantic structure diagram guided ECG self-coding method and system |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104834748A (en) * | 2015-05-25 | 2015-08-12 | Institute of Automation, Chinese Academy of Sciences | Image retrieval method using deep semantics to rank hash codes |
CN105512289A (en) * | 2015-12-07 | 2016-04-20 | Zhengzhou Jinhui Computer System Engineering Co., Ltd. | Image retrieval method based on deep learning and hashing |
CN108399185A (en) * | 2018-01-10 | 2018-08-14 | Institute of Information Engineering, Chinese Academy of Sciences | Binary vector generation method and image semantic similarity retrieval method for multi-label images |
CN109165306A (en) * | 2018-08-09 | 2019-01-08 | Changsha University of Science and Technology | Image retrieval method based on multi-task hash learning |
CN109241313A (en) * | 2018-08-14 | 2019-01-18 | Dalian University | Image retrieval method based on high-order deep hash learning |
CN109284741A (en) * | 2018-10-30 | 2019-01-29 | Wuhan University | Large-scale remote sensing image retrieval method and system based on deep hash network |
Non-Patent Citations (1)
Title |
---|
PENG Tianqiang et al.: "Image retrieval method based on deep convolutional neural network and binary hash learning", Journal of Electronics & Information Technology * |
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110532417B (en) * | 2019-09-02 | 2022-03-29 | 河北省科学院应用数学研究所 | Image retrieval method and device based on depth hash and terminal equipment |
CN110532417A (en) * | 2019-09-02 | 2019-12-03 | 河北省科学院应用数学研究所 | Image search method, device and terminal device based on depth Hash |
CN111143400A (en) * | 2019-12-26 | 2020-05-12 | 长城计算机软件与系统有限公司 | Full-stack type retrieval method, system, engine and electronic equipment |
CN111143400B (en) * | 2019-12-26 | 2024-05-14 | 新长城科技有限公司 | Full stack type retrieval method, system, engine and electronic equipment |
CN111709252A (en) * | 2020-06-17 | 2020-09-25 | 北京百度网讯科技有限公司 | Model improvement method and device based on pre-trained semantic model |
US11775766B2 (en) | 2020-06-17 | 2023-10-03 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method and apparatus for improving model based on pre-trained semantic model |
CN111709252B (en) * | 2020-06-17 | 2023-03-28 | 北京百度网讯科技有限公司 | Model improvement method and device based on pre-trained semantic model |
CN112734386A (en) * | 2021-01-13 | 2021-04-30 | 国家电网有限公司 | New energy network access full-flow through method and system based on association matching algorithm |
CN112765382A (en) * | 2021-01-20 | 2021-05-07 | 上海依图网络科技有限公司 | Image searching method, image searching device, image searching medium and electronic equipment |
CN113221658A (en) * | 2021-04-13 | 2021-08-06 | 卓尔智联(武汉)研究院有限公司 | Training method and device of image processing model, electronic equipment and storage medium |
CN113641845B (en) * | 2021-07-16 | 2022-09-23 | 广西师范大学 | Depth feature contrast weighted image retrieval method based on vector contrast strategy |
CN113641845A (en) * | 2021-07-16 | 2021-11-12 | 广西师范大学 | Depth feature contrast weighted image retrieval method based on vector contrast strategy |
CN114219983A (en) * | 2021-12-17 | 2022-03-22 | 国家电网有限公司信息通信分公司 | Neural network training method, image retrieval method and device |
CN115878823A (en) * | 2023-03-03 | 2023-03-31 | 中南大学 | Deep hash method based on graph convolution network and traffic data retrieval method |
CN115878823B (en) * | 2023-03-03 | 2023-04-28 | 中南大学 | Deep hash method and traffic data retrieval method based on graph convolution network |
CN116645661A (en) * | 2023-07-27 | 2023-08-25 | 深圳市青虹激光科技有限公司 | Method and system for detecting duplicate prevention code |
CN116645661B (en) * | 2023-07-27 | 2023-11-14 | 深圳市青虹激光科技有限公司 | Method and system for detecting duplicate prevention code |
CN118152914A (en) * | 2024-03-18 | 2024-06-07 | 山东管理学院 | Semantic structure diagram guided ECG self-coding method and system |
Also Published As
Publication number | Publication date |
---|---|
CN109977250B (en) | 2023-03-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109977250A (en) | Deep hash image retrieval method fusing semantic information and multilevel similarity | |
Yue et al. | Matching guided distillation | |
Biten et al. | Good news, everyone! context driven entity-aware captioning for news images | |
CN106021364B (en) | Establishment of an image-search relevance prediction model, and image search method and device | |
Wang et al. | Research on Web text classification algorithm based on improved CNN and SVM | |
CN108959396A (en) | Machine reading model training method and device, and question answering method and device | |
CN110442684A (en) | A similar-case recommendation method based on text content | |
CN104598611B (en) | Method and system for ranking search entries | |
CN108170736A (en) | A rapid document-skimming qualitative method based on a recurrent attention mechanism | |
CN109948029A (en) | Neural-network-based adaptive deep hash image retrieval method | |
CN110059181A (en) | Short text tagging method, system and device for large-scale classification systems | |
CN108984642B (en) | Printed fabric image retrieval method based on Hash coding | |
CN110309839A (en) | An image captioning method and device | |
CN107729311A (en) | A Chinese text feature extraction method fusing the tone of the text | |
CN110825850B (en) | Natural language theme classification method and device | |
CN113157919B (en) | Sentence text aspect-level emotion classification method and sentence text aspect-level emotion classification system | |
CN109960732A (en) | Robust-supervision-based deep discrete cross-modal hashing retrieval method and system | |
Li et al. | Can vision transformers perform convolution? | |
CN115168579A (en) | Text classification method based on multi-head attention mechanism and two-dimensional convolution operation | |
CN109522432A (en) | An image retrieval method fusing adaptive similarity and a Bayesian framework | |
Du et al. | Efficient network construction through structural plasticity | |
CN114065769B (en) | Method, device, equipment and medium for training emotion reason pair extraction model | |
Li et al. | Multimodal fusion with co-attention mechanism | |
CN108805280A (en) | A method and apparatus for image retrieval | |
Chen et al. | Compressing fully connected layers using Kronecker tensor decomposition |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||