CN109977250A - Deep hashing image retrieval method fusing semantic information and multi-level similarity - Google Patents

Deep hashing image retrieval method fusing semantic information and multi-level similarity

Info

Publication number
CN109977250A
CN109977250A (application CN201910211486.6A)
Authority
CN
China
Prior art keywords
vector
image
hash
label
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910211486.6A
Other languages
Chinese (zh)
Other versions
CN109977250B (en)
Inventor
冯永
沈一鸣
尚家兴
强保华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University
Guilin University of Electronic Technology
Original Assignee
Chongqing University
Guilin University of Electronic Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University and Guilin University of Electronic Technology
Priority claimed from CN201910211486.6A
Publication of CN109977250A
Application granted
Publication of CN109977250B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The invention discloses a deep hashing image retrieval method fusing semantic information and multi-level similarity, comprising the following steps: S1, construct an image database; S2, construct a label vector matrix and a semantic vector matrix; S3, construct a similarity matrix; S4, build a deep hashing neural network model that converts original images into approximate hash vectors; S5, construct a loss function that places a lower bound on the Hamming distance between the hash vectors of similar pictures; S6, train the deep hashing neural network model; S7, construct a hash vector database of the images; S8, compare the hash vector of the query image with the vectors in the hash vector database to find similar images. By fusing semantic information, the invention improves image retrieval precision; and by constraining the lower bound of the Hamming distance between the hash vectors of two similar pictures, it improves retrieval performance.

Description

Deep hashing image retrieval method fusing semantic information and multi-level similarity
Technical field
The present invention relates to the field of image retrieval, and in particular to a deep hashing image retrieval method fusing semantic information and multi-level similarity.
Background technique
In recent years, with the development of the Internet, massive image data has posed a huge challenge to image retrieval. Faced with large-scale and complex image data, a retrieval system must balance retrieval quality against retrieval efficiency while also addressing the storage cost of massive information, in order to deliver a good user experience. Studying better image retrieval methods therefore has great practical significance.
Image retrieval based on deep hashing is currently a popular approach. Its advantage is that, by mapping each picture to a binary hash vector, fast bitwise comparison can be exploited, which increases retrieval speed while reducing the required storage space.
Faced with more complex pictures, traditional deep hashing methods expose obvious shortcomings. On the one hand, they are too coarse when measuring the similarity between different images: as long as two pictures share a label they are treated as similar, and otherwise as dissimilar. This ignores the finer-grained similarity levels between pictures and the semantic information that the pictures contain. On the other hand, the traditional pairwise loss function uses a single threshold to constrain only the upper bound of the Hamming distance between the hash vectors of two similar pictures, with no constraint on the lower bound. As a result, the relative distances between images with different degrees of similarity cannot be guaranteed, which reduces the ranking accuracy of the retrieval results.
Summary of the invention
The object of the present invention is to overcome the above deficiencies of the prior art by providing a deep hashing image retrieval method fusing semantic information and multi-level similarity. By fusing semantic information, finer-grained similarity levels between pictures are taken into account, improving image retrieval precision; and by adopting a new pairwise loss function that places a lower bound on the Hamming distance between the hash vectors of similar pictures, the ranking accuracy of the retrieval results is improved.
To achieve the above object, the present invention provides the following technical scheme:
A deep hashing image retrieval method fusing semantic information and multi-level similarity, comprising the following steps:
S1, construct an image database;
S2, construct a label vector matrix and a semantic vector matrix;
S3, construct a similarity matrix;
S4, build a deep hashing neural network model that converts original images into approximate hash vectors;
S5, construct a loss function that places a lower bound on the Hamming distance between the hash vectors of similar pictures;
S6, train the deep hashing neural network model;
S7, construct a hash vector database of the images;
S8, compare the hash vector of the query image with the vectors in the hash vector database to find similar images. Building the similarity matrix from the label vector matrix and the semantic vector matrix fuses semantic information and improves retrieval precision; constructing a loss function with a lower bound on the Hamming distance between the hash vectors of similar pictures improves the ranking accuracy of the retrieval results.
Preferably, in step S2, the label vector matrix and the semantic vector matrix are constructed as follows:
Images, together with their labels and text descriptions, are randomly sampled from the image database to construct the label vector matrix and the semantic vector matrix. The label vector matrix L is built from the label information, where L_{i,j} = 0 indicates that the i-th picture does not contain the j-th label and L_{i,j} = 1 indicates that it does. Using natural language processing, the text description of each picture is encoded into a vector, yielding the picture semantic vector matrix C, where C_i is the vector encoding the text description of the i-th picture and represents the semantic information of that picture.
Preferably, in step S3, the similarity matrix is constructed as follows:
S3-1, using the label vector matrix constructed in step S2, take pairwise inner products of the label vectors to construct the label similarity matrix:

S^{label}_{i,j} = (L L^T)_{i,j} / L^{total}_{i,j},  i, j = 1, ..., n   (1)

In formula (1), S^{label} is the label similarity matrix, S^{label}_{i,j} is the label similarity between image i and image j, n is the number of pictures, L is the label vector matrix constructed in S2, L^T is its transpose, and L^{total} is the total-label matrix between pictures, where L^{total}[i, j] is the total number of labels contained in pictures i and j;
S3-2, using the semantic vector matrix constructed in step S2, compute the pairwise cosine similarities of the semantic vectors to construct the semantic similarity matrix:

S^{seman}_{i,j} = cos(C_i, C_j),  i, j = 1, ..., n   (2)
cos(C_i, C_j) = <C_i, C_j> / (||C_i|| ||C_j||)   (3)

In formula (2), S^{seman} is the semantic similarity matrix, S^{seman}_{i,j} is the semantic similarity between image i and image j, and n is the number of pictures; in formula (3), C is the semantic vector matrix constructed in S2 and ||C_i|| is the norm of vector C_i;
S3-3, construct the similarity matrix from the label similarity matrix and the semantic similarity matrix:

s_{i,j} = w · S^{label}_{i,j} + (1 - w) · S^{seman}_{i,j},  i, j = 1, ..., n   (4)

In formula (4), S is the similarity matrix, s_{i,j} is the similarity between image i and image j, n is the number of pictures, and w is a weight coefficient.
Preferably, step S4 comprises the following steps:
S4-1, build an AlexNet network model using the TensorFlow deep learning framework, and pre-train it on the ImageNet dataset;
S4-2, modify the classical AlexNet model to construct the deep hashing neural network model.
The deep hashing neural network is structured as follows:
it contains 5 convolutional layers: a first convolutional layer, a second convolutional layer, a third convolutional layer, a fourth convolutional layer, and a fifth convolutional layer;
and 3 fully connected layers: a first fully connected layer, a second fully connected layer, and a hash layer.
When the AlexNet model built with TensorFlow is pre-trained, the parameters are updated automatically by backpropagation; no manual parameter tuning is needed during the process, so pre-training is simple.
Preferably, the hash layer has 64 neurons.
Preferably, in step S5, the loss function is designed as formula (8), in which: L denotes the loss function; s_{i,j} is the similarity between image i and image j; S^{label}_{i,j} is the label similarity between image i and image j; S is the similarity matrix; α is the hyperparameter adjusting the upper threshold and β the hyperparameter adjusting the lower threshold; σ is a first parameter and δ a second parameter; N_bits is the length of the generated hash vector; b_i and b_j are the approximate hash vectors of the i-th and j-th images, and D(b_i, b_j) is the Euclidean distance between them; γ is a weight coefficient; 1 denotes the all-ones vector with the same dimension as b_i, and ‖|b_i| - 1‖₁ and ‖|b_j| - 1‖₁ are the element-wise sums of the differences between the all-ones vector and the absolute values of the approximate hash vectors b_i and b_j. The upper and lower thresholds adapt to the similarity s_{i,j} between the pictures: for two pictures that share a label, the Euclidean distance between the corresponding approximate hash vectors should lie between the lower and upper bounds; for two pictures that share no label, the Euclidean distance between the corresponding approximate hash vectors is widened as far as possible. This improves image retrieval precision.
Preferably, in step S6, the deep hashing neural network model is trained by stochastic gradient descent:

μ' = μ - λ · ∂L/∂μ   (9)

In formula (9), μ is any parameter of the deep hashing neural network model, μ' is the updated parameter, λ is the update step size, L is the loss function, and ∂L/∂μ is the gradient of L with respect to μ. Training by stochastic gradient descent updates the parameters of the deep hashing neural network model and improves image retrieval precision.
Preferably, in step S7, the images in the image database are fed into the deep hashing neural network model trained in step S6, yielding the approximate hash vector set B = {b_1, b_2, ... b_n}, where n is the number of images in the database and b_n is the approximate hash vector of the n-th image. Passing B through the sign function gives the corresponding binary hash vector database H = {h_1, h_2, ... h_n}, where h_n is the binary hash vector of the n-th image. Representing picture features as binary hash vectors improves retrieval speed.
Compared with the prior art, the beneficial effects of the present invention are:
1. The similarity matrix is built from the label vector matrix and the semantic vector matrix, fusing semantic information and improving image retrieval precision;
2. A loss function with a lower bound on the Hamming distance between the hash vectors of similar pictures improves the ranking accuracy of the retrieval results;
3. The upper and lower thresholds adapt to the similarity s_{i,j} between pictures: for two pictures that share a label, the Euclidean distance between the corresponding approximate hash vectors should lie between the two bounds, while for two pictures that share no label, the distance is widened as far as possible, improving retrieval precision;
4. Picture features are represented as binary hash vectors, improving retrieval speed.
Brief description of the drawings:
Fig. 1 is a flow chart of the deep hashing image retrieval method fusing semantic information and multi-level similarity of exemplary embodiment 1;
Fig. 2 is a structure diagram of the deep hashing neural network model of exemplary embodiment 1.
Reference numerals: 11 - first convolutional layer, 12 - second convolutional layer, 13 - third convolutional layer, 14 - fourth convolutional layer, 15 - fifth convolutional layer, 21 - first max pooling layer, 22 - second max pooling layer, 23 - third max pooling layer, 31 - first fully connected layer, 32 - second fully connected layer, 41 - hash layer.
Detailed description of the embodiments
The present invention is described in further detail below with reference to test examples and specific embodiments. This should not be understood as limiting the scope of the invention to the following embodiments; all techniques realized based on the content of the present invention fall within the scope of the invention.
Embodiment 1
As shown in Fig. 1, this embodiment provides a deep hashing image retrieval method fusing semantic information and multi-level similarity, comprising the following steps.
S1: construct the image database.
The K label classes with the highest frequency of occurrence in the dataset, together with the pictures containing those K labels, are selected to construct the image database.
In this embodiment, the Microsoft COCO dataset is used, in which each image carries several label classes (for example person, water, car, etc.). The label classes are ranked by frequency of occurrence, from most to least frequent, and the top K classes, together with the images carrying them, are used to construct the image database. For example, the 20 most frequent label classes in the COCO dataset, and their corresponding images, may be chosen to construct the image database.
S2: construct the label vector matrix and the semantic vector matrix.
Images, together with their labels and text descriptions, are randomly sampled from the image database to construct the label vector matrix and the semantic vector matrix.
In this embodiment, n images and their labels are randomly sampled from the image database to form the training set T = {t_1, t_2, ..., t_n}, where t_n denotes the n-th image and its labels, n ≥ 1, and t_n = {I_n, L_n}: I_n is the n-th image and L_n is the label vector of the n-th image. The label vectors form the label vector matrix L of size n × K, where n is the number of images and K the number of label classes; L_{i,j} = 0 indicates that the i-th picture does not contain the j-th label and L_{i,j} = 1 indicates that it does. In addition, using natural language processing, the text description of each picture is encoded into a vector, yielding the picture semantic vector matrix C, where C_i is the vector encoding the text description of the i-th picture and represents its semantic information. In this embodiment, each picture's text description is encoded into a 512-dimensional vector.
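A minimal NumPy sketch of this construction, assuming labels arrive as per-image index sets; the `encode_text` function below is a hypothetical stand-in for the text encoder (any sentence embedding model producing a fixed-size 512-dimensional vector would play this role):

```python
import numpy as np

def build_label_matrix(image_labels, num_classes):
    """Build the n x K binary label matrix L: L[i, j] = 1 iff image i has label j."""
    L = np.zeros((len(image_labels), num_classes), dtype=np.int64)
    for i, labels in enumerate(image_labels):
        L[i, list(labels)] = 1
    return L

# Hypothetical stand-in for a sentence encoder producing 512-d vectors.
def encode_text(description, dim=512):
    rng = np.random.default_rng(abs(hash(description)) % (2**32))
    return rng.standard_normal(dim)

image_labels = [{0, 3}, {3}, {1, 2}]   # label indices per image
L = build_label_matrix(image_labels, num_classes=4)
C = np.stack([encode_text(d) for d in ["a dog on grass", "grass", "a red car"]])
print(L.shape, C.shape)   # (3, 4) (3, 512)
```

Any real embodiment would replace `encode_text` with an actual NLP encoder; only the matrix shapes matter for the steps that follow.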
S3: construct the similarity matrix.
S3-1: using the label vector matrix constructed in step S2, construct the label similarity matrix.
In this embodiment, pairwise inner products of all label vectors L_n are taken, and the inner product of any two label vectors is divided by the total number of labels involved in the corresponding pair of images; the result is used to construct the label similarity matrix, of size n × n, where n is the number of images.
The label similarity matrix S^{label} is expressed as:

S^{label}_{i,j} = (L L^T)_{i,j} / L^{total}_{i,j},  i, j = 1, ..., n   (1)

In formula (1), S^{label} is the label similarity matrix, S^{label}_{i,j} is the label similarity between image i and image j, n is the number of pictures, L is the label vector matrix constructed in S2, L^T is its transpose, and L^{total} is the total-label matrix between pictures, where L^{total}[i, j] is the total number of labels contained in pictures i and j.
S3-2: using the semantic vector matrix constructed in step S2, compute the pairwise cosine similarities of the semantic vectors to construct the semantic similarity matrix.
In this embodiment, the semantic similarity matrix is of size n × n, where n is the number of images, and is expressed as:

S^{seman}_{i,j} = cos(C_i, C_j),  i, j = 1, ..., n   (2)
cos(C_i, C_j) = <C_i, C_j> / (||C_i|| ||C_j||)   (3)

In formula (2), S^{seman} is the semantic similarity matrix, S^{seman}_{i,j} is the semantic similarity between image i and image j, and n is the number of pictures; in formula (3), C is the semantic vector matrix constructed in S2 and ||C_i|| is the norm of vector C_i.
S3-3: construct the similarity matrix from the label similarity matrix and the semantic similarity matrix.
In this embodiment, the similarity matrix S merges the label similarity matrix and the semantic similarity matrix obtained in steps S3-1 and S3-2. Its size is n × n, where n is the number of images. The similarity matrix S is expressed as:

s_{i,j} = w · S^{label}_{i,j} + (1 - w) · S^{seman}_{i,j},  i, j = 1, ..., n   (4)

In formula (4), S is the similarity matrix, s_{i,j} is the similarity between image i and image j, n is the number of pictures, and w is a weight coefficient, set to 0.5 in this embodiment.
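The computation of formulas (1) through (4) can be sketched in NumPy as follows. This is an illustrative reading of the construction above, not the patented implementation; in particular, which of the two terms carries the weight w is immaterial here since w = 0.5:

```python
import numpy as np

def similarity_matrix(L, C, w=0.5):
    """Fuse label similarity (1) and semantic cosine similarity (2)-(3) per (4)."""
    L = L.astype(np.float64)
    inner = L @ L.T                              # shared-label counts
    totals = L.sum(axis=1)
    L_total = totals[:, None] + totals[None, :]  # labels of images i and j in total
    S_label = inner / L_total                    # formula (1)
    Cn = C / np.linalg.norm(C, axis=1, keepdims=True)
    S_seman = Cn @ Cn.T                          # formulas (2)-(3)
    return w * S_label + (1.0 - w) * S_seman     # formula (4)

L = np.array([[1, 0, 1], [1, 0, 0], [0, 1, 0]])      # 3 images, 3 label classes
C = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])   # toy 2-d semantic vectors
S = similarity_matrix(L, C)
print(np.round(S, 3))
```

Note that under this reading a picture's self-similarity is 0.75 rather than 1, because formula (1) divides the shared-label count by the combined label count of the pair.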
S4: build the deep hashing neural network model and convert original images into approximate hash vectors.
S4-1: build an AlexNet network model using the TensorFlow deep learning framework and pre-train it on the ImageNet dataset.
TensorFlow is a relatively high-level machine learning library with which users can conveniently design neural network structures; it supports automatic differentiation, so users no longer need to derive gradients by hand for backpropagation. Convolutional neural networks (CNNs) are the core algorithm of image recognition and classification, and AlexNet is a classical deep learning model. The AlexNet model is built with TensorFlow and pre-trained on the ImageNet dataset. Pre-training means training the randomly initialized AlexNet network on a classification task so that its parameters learn generic features; the parameters are updated automatically by backpropagation, and no manual tuning is needed during the process.
S4-2: construct the deep hashing neural network model.
This embodiment modifies the classical AlexNet model to construct the deep hashing neural network model and improve image retrieval precision.
The last fully connected layer of the pre-trained AlexNet model is removed, the remaining structure and parameters are retained, and a new hash layer fch is added at the top of the network. The fch layer contains 64 neurons, and its activation function is set to tanh, so that each neuron of fch outputs a value in [-1, 1]. The structure of the resulting deep hashing neural network model is shown in Fig. 2:
it contains 5 convolutional layers: the first convolutional layer (conv1) 11, the second convolutional layer (conv2) 12, the third convolutional layer (conv3) 13, the fourth convolutional layer (conv4) 14, and the fifth convolutional layer (conv5) 15;
and 3 fully connected layers: the first fully connected layer (fc6) 31, the second fully connected layer (fc7) 32, and the hash layer (fch) 41.
The input of the first convolutional layer (conv1) 11 receives the original image; the output of conv1 feeds the first max pooling layer 21, whose output feeds the second convolutional layer (conv2) 12; the output of conv2 feeds the second max pooling layer 22, whose output feeds the third convolutional layer (conv3) 13; the output of conv3 feeds the fourth convolutional layer (conv4) 14, whose output feeds the fifth convolutional layer (conv5) 15; the output of conv5 feeds the third max pooling layer 23, whose output feeds the first fully connected layer (fc6) 31; the output of fc6 feeds the second fully connected layer (fc7) 32, whose output feeds the hash layer (fch) 41, which outputs the approximate hash vector.
An original image fed into the deep hashing neural network model is mapped through the convolutional and fully connected layers into an approximate hash vector, each dimension of which lies in [-1, 1]. For example, an image of size 227 × 227 × 3 fed into the model passes through the 5 convolutional layers and 3 fully connected layers and yields a 64-dimensional approximate hash vector. Multiple images can be fed in at once, yielding the set of approximate hash vectors B = {b_1, b_2, ... b_n}, where b_n is the approximate hash vector of the n-th image.
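The hash layer's role can be illustrated in isolation: a fully connected projection followed by tanh squashes a feature vector (standing in for the fc7 output) into an approximate hash vector with every dimension strictly inside [-1, 1]. This NumPy sketch uses random weights purely for illustration; in the actual model the weights are learned:

```python
import numpy as np

def hash_layer(features, W, b):
    """fch: fully connected layer with tanh activation -> values in (-1, 1)."""
    return np.tanh(features @ W + b)

rng = np.random.default_rng(42)
fc7_out = rng.standard_normal((3, 4096))    # stand-in for fc7 activations
W = rng.standard_normal((4096, 64)) * 0.01  # 64 neurons in the hash layer
b = np.zeros(64)
B = hash_layer(fc7_out, W, b)               # approximate hash vectors
print(B.shape)                              # (3, 64)
```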
S5: construct a loss function with a lower bound on the Hamming distance between the hash vectors of similar pictures.
S5-1: compute the similarity between the approximate hash vectors of the pictures.
In this embodiment, for the approximate hash vector set B = {b_1, b_2, ... b_n} obtained in S4, the Euclidean distance D(b_i, b_j) between approximate hash vectors in the set expresses the similarity between two approximate hash vectors, where b_i and b_j are the approximate hash vectors of the i-th and j-th images.
S5-2: based on the Euclidean distance D(b_i, b_j), construct the pairwise loss function L_T, expressed as formula (5).
In formula (5), L_T denotes the pairwise loss function, s_{i,j} is the similarity between image i and image j, S^{label}_{i,j} is the label similarity between image i and image j, S is the similarity matrix, α is the hyperparameter adjusting the upper threshold, β is the hyperparameter adjusting the lower threshold, σ is a first parameter (2.5 in this embodiment), δ is a second parameter (1.5 in this embodiment), and N_bits is the length of the generated hash vector (64 in this embodiment); b_i and b_j are the approximate hash vectors of the i-th and j-th images, and D(b_i, b_j) is the Euclidean distance between them.
The meaning of the pairwise loss function is as follows: for two pictures that share a label, the Euclidean distance between the corresponding approximate hash vectors should lie between the lower and upper bounds, where the two thresholds adapt to the similarity s_{i,j} between the pictures. For two pictures that share no label, the Euclidean distance between the corresponding approximate hash vectors is widened as far as possible; only once it exceeds the prescribed threshold δ · N_bits does the pair stop generating loss.
S5-3: define a quantization loss L_Q for the approximate hash vectors.
The quantization loss L_Q is expressed as:

L_Q = ‖ |b_i| - 1 ‖₁ + ‖ |b_j| - 1 ‖₁   (6)

In formula (6), 1 denotes the all-ones vector with the same dimension as b_i, and each term is the element-wise sum of the differences between the all-ones vector and the absolute values of the approximate hash vector b_i or b_j.
The meaning of this formula is: the closer each dimension of an approximate hash vector is to 1 or -1, the higher the probability that it is a valid hash vector, and the smaller the loss it generates.
S5-4: construct the complete loss function.
In this embodiment, the loss function L merges the pairwise loss L_T with the quantization loss L_Q:

L = L_T + γ · L_Q   (7)

In formula (7), γ is the weight coefficient of the quantization loss, set to 1.0 in this embodiment. Substituting formulas (5) and (6) into (7) gives the complete loss function L of formula (8).
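The quantization loss of formula (6) is fully specified by the text above and can be sketched directly. The exact adaptive form of the pairwise term in formula (5) is not reproduced in this text, so the sketch below substitutes a simple hinge-style bound on the Euclidean distance as a hypothetical stand-in, with fixed rather than adaptive thresholds:

```python
import numpy as np

def quantization_loss(b_i, b_j):
    """Formula (6): penalize dimensions of b that stray from +1 or -1."""
    return np.sum(np.abs(np.abs(b_i) - 1.0)) + np.sum(np.abs(np.abs(b_j) - 1.0))

def pairwise_loss(b_i, b_j, similar, lower=2.0, upper=6.0, delta_nbits=96.0):
    """Hypothetical stand-in for formula (5): keep a similar pair's distance
    inside [lower, upper]; push a dissimilar pair apart until delta * N_bits
    (here 1.5 * 64 = 96, the embodiment's values)."""
    d = np.linalg.norm(b_i - b_j)
    if similar:
        return max(0.0, d - upper) + max(0.0, lower - d)
    return max(0.0, delta_nbits - d)

def total_loss(b_i, b_j, similar, gamma=1.0, **kw):
    """Formula (7): L = L_T + gamma * L_Q."""
    return pairwise_loss(b_i, b_j, similar, **kw) + gamma * quantization_loss(b_i, b_j)

b_i = np.array([0.9, -1.0, 0.8])
b_j = np.array([1.0, -0.7, 0.6])
print(round(quantization_loss(b_i, b_j), 2))   # 0.3 + 0.7 = 1.0
```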
S6: train the deep hashing neural network model.
S6-1: construct the optimization objective from the loss function.
In this embodiment, the optimization objective is built from the loss function: argmin_Θ L, i.e. find the values of all parameters Θ that minimize L, where Θ is the set of parameters of the deep hashing neural network model and L is the loss function constructed above.
S6-2: solve the optimization objective by stochastic gradient descent.
In this embodiment, the optimization objective is solved by stochastic gradient descent: the gradient of the loss function L with respect to each parameter μ is computed, and the parameter is updated in the opposite direction of the gradient:

μ' = μ - λ · ∂L/∂μ   (9)

In formula (9), μ is any parameter of the deep hashing neural network model, μ' is the updated parameter, L is the loss function, and ∂L/∂μ is its gradient with respect to μ. λ is the update step size (i.e. the learning rate) and may be set to 0.0003.
In this embodiment, each training batch contains 256 samples and the number of iterations is 10000.
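The update of formula (9) is the plain gradient-descent step. As a minimal illustration on a toy quadratic loss (not the network itself), with the analytic gradient written out by hand:

```python
import numpy as np

def sgd_step(mu, grad, lr=0.0003):
    """Formula (9): move each parameter opposite its gradient."""
    return mu - lr * grad

# Toy example: minimize L(mu) = ||mu - target||^2, whose gradient is 2*(mu - target).
target = np.array([1.0, -1.0])
mu = np.zeros(2)
for _ in range(20000):
    mu = sgd_step(mu, 2.0 * (mu - target))
print(np.round(mu, 3))   # approaches [ 1. -1.]
```

In the actual model the gradient is supplied by TensorFlow's automatic differentiation rather than written by hand.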
S7: construct the hash vector database of the images.
In this embodiment, the images in the image database are fed into the deep hashing neural network model trained in S6, yielding the approximate hash vector set B = {b_1, b_2, ... b_n}, where n is the number of images in the database and b_n is the approximate hash vector of the n-th image. B is passed through the sign function (which maps numbers greater than or equal to 0 to 1 and numbers less than 0 to -1), giving the corresponding binary hash vector database H = {h_1, h_2, ... h_n}, where h_n is the binary hash vector of the n-th image.
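Binarization under the sign convention described above (zero maps to +1, unlike NumPy's `np.sign`, which maps it to 0) can be sketched as:

```python
import numpy as np

def binarize(B):
    """Map approximate hash vectors to {-1, +1}: >= 0 -> 1, < 0 -> -1."""
    return np.where(B >= 0, 1, -1)

B = np.array([[0.7, -0.2, 0.0], [-0.9, 0.4, 0.1]])
H = binarize(B)
print(H)
```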
S8: compare the hash vector of the query image with the vectors in the hash vector database to find similar images.
In this embodiment, the query image i is fed into the trained deep hashing neural network model to obtain its approximate hash vector b_i, and the sign function then gives the corresponding binary hash vector h_i. The hash vector h_i is combined with every hash vector in the image hash vector database by a bitwise comparison, giving a result value for each; the result values are then sorted in descending order. The larger the result value, the higher the similarity between that hash vector and h_i, i.e. the more similar the corresponding image is to the query image i, which ensures retrieval precision. For example, h_i is compared with the first hash vector of the database to obtain a first result value, and with the second hash vector to obtain a second result value; if the first result value is greater than the second, the image corresponding to the first hash vector is more similar to the query image than the image corresponding to the second.
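With ±1 hash vectors, ranking by inner product is equivalent to ranking by ascending Hamming distance, since Hamming(h_i, h_j) = (N - <h_i, h_j>) / 2. The sketch below uses that equivalence as one interpretation of the comparison step, which the text describes only as a bitwise operation whose larger result values indicate more similar images:

```python
import numpy as np

def retrieve(h_query, H_db):
    """Rank database images by descending inner product with the query hash."""
    scores = H_db @ h_query       # larger score -> smaller Hamming distance
    order = np.argsort(-scores)   # indices of most similar images first
    return order, scores

H_db = np.array([[ 1,  1, -1,  1],
                 [-1, -1,  1, -1],
                 [ 1, -1, -1,  1]])
h_q = np.array([1, 1, -1, 1])
order, scores = retrieve(h_q, H_db)
print(order.tolist(), scores.tolist())   # [0, 2, 1] [4, -4, 2]
```

Because the comparison is a single matrix-vector product over short binary codes, this step is what gives deep hashing its retrieval speed advantage.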
The above is only a detailed description of specific embodiments of the invention, not a limitation on the invention. Various substitutions, modifications, and improvements made by those skilled in the relevant technical field without departing from the principle and scope of the invention shall all fall within the protection scope of the invention.

Claims (8)

1. A deep hashing image retrieval method fusing semantic information and multi-level similarity, characterized by comprising the following steps:
S1, construct an image database;
S2, construct a label vector matrix and a semantic vector matrix;
S3, construct a similarity matrix;
S4, build a deep hashing neural network model that converts original images into approximate hash vectors;
S5, construct a loss function that places a lower bound on the Hamming distance between the hash vectors of similar pictures;
S6, train the deep hashing neural network model;
S7, construct a hash vector database of the images;
S8, compare the hash vector of the query image with the vectors in the hash vector database to find similar images.
2. The deep hash image retrieval method fusing semantic information and multi-level similarity according to claim 1, characterized in that in step S2, the label vector matrix and the semantic vector matrix are constructed as follows:
images, together with their corresponding labels and text descriptions, are randomly selected from the image database to construct the label vector matrix and the semantic vector matrix; the label vector matrix L is built from the label information, where L_{i,j} = 0 indicates that the i-th image does not contain the j-th label and L_{i,j} = 1 indicates that the i-th image contains the j-th label; using natural language processing techniques, the text description of each image is encoded into a vector to build the image semantic vector matrix C, where C_i is the vector corresponding to the text description of the i-th image and represents the semantic information of that image.
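A minimal sketch of constructing L and C under stated assumptions: the label vocabulary and sample data are invented for illustration, and the toy text encoder below (averaging hashed word indicators) merely stands in for the unspecified "natural language processing technique" of the claim.

```python
import numpy as np

# Illustrative data: per-image label lists and text descriptions.
labels = [["dog", "grass"], ["dog"], ["car"]]
vocab = ["dog", "grass", "car"]

# Label vector matrix L: L[i, j] = 1 iff image i carries label j.
L = np.zeros((len(labels), len(vocab)), dtype=int)
for i, labs in enumerate(labels):
    for lab in labs:
        L[i, vocab.index(lab)] = 1

descriptions = ["a dog on grass", "a running dog", "a red car"]

def encode(text, dim=8):
    # Toy stand-in text encoder: average of hashed word indicators.
    # The patent only says descriptions are encoded into a vector.
    v = np.zeros(dim)
    for w in text.split():
        v[hash(w) % dim] += 1.0
    return v / max(len(text.split()), 1)

# Semantic vector matrix C: row i encodes the description of image i.
C = np.stack([encode(d) for d in descriptions])
```

In practice the encoder would be replaced by a real sentence embedding model; only the matrix shapes and the 0/1 convention of L are taken from the claim.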
3. The deep hash image retrieval method fusing semantic information and multi-level similarity according to claim 2, characterized in that in step S3, the similarity matrix is constructed as follows:
S3-1, using the label vector matrix constructed in step S2, computing the pairwise inner products of the label vectors to construct the label similarity matrix:
in formula (1), S_label is the label similarity matrix, s^l_{i,j} is the label similarity between image i and image j, n is the number of images, L is the label vector matrix constructed in S2, L^T is the transpose of L, and L_total is the total-label matrix between images, where L_total[i, j] is the total number of labels contained by images i and j;
S3-2, using the semantic vector matrix constructed in step S2, computing the pairwise cosine similarities of the semantic vectors to construct the semantic similarity matrix:
in formula (2), S_seman is the semantic similarity matrix, s^s_{i,j} is the semantic similarity between image i and image j, and n is the number of images; in formula (3), C is the semantic vector matrix constructed in S2, and ||C_i|| is the norm of vector C_i;
S3-3, constructing the similarity matrix from the label similarity matrix and the semantic similarity matrix:
in formula (4), S is the similarity matrix, s_{i,j} is the similarity between image i and image j, n is the number of images, and w is a weight coefficient.
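The three constructions of steps S3-1 to S3-3 can be sketched with NumPy. Formulas (1)–(4) themselves are not reproduced in this text, so the element-wise forms below, in particular the (1 − w) weighting of the semantic term, are a plausible reading of the symbol definitions rather than the patent's exact equations.

```python
import numpy as np

# Label vector matrix L (3 images, 3 labels) and semantic vectors C.
L = np.array([[1, 1, 0],
              [1, 0, 0],
              [0, 0, 1]], dtype=float)
C = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])

# S3-1: label similarity. (L @ L.T)[i, j] counts shared labels;
# L_total[i, j] is read as the total number of labels carried by
# images i and j together (an assumption about formula (1)).
shared = L @ L.T
row_counts = L.sum(axis=1, keepdims=True)
L_total = row_counts + row_counts.T
S_label = shared / L_total

# S3-2: semantic similarity = pairwise cosine similarity of C's rows.
norms = np.linalg.norm(C, axis=1, keepdims=True)
S_seman = (C @ C.T) / (norms * norms.T)

# S3-3: weighted fusion; the (1 - w) complement is an assumption.
w = 0.5
S = w * S_label + (1 - w) * S_seman
```

Images 0 and 1 share one label out of three in total, so their label similarity is 1/3, while their semantic similarity is the cosine 1/√2; the fused matrix S stays symmetric.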
4. The deep hash image retrieval method fusing semantic information and multi-level similarity according to claim 1, characterized in that step S4 comprises the following steps:
S4-1, building an AlexNet network model using the TensorFlow open-source deep learning framework, and pre-training the AlexNet network model on the ImageNet dataset;
S4-2, modifying the classical AlexNet model to construct the deep hash neural network model;
the structure of the constructed deep hash neural network is as follows:
five convolutional layers: a first convolutional layer, a second convolutional layer, a third convolutional layer, a fourth convolutional layer, and a fifth convolutional layer;
three fully connected layers: a first fully connected layer, a second fully connected layer, and a hash layer.
5. The deep hash image retrieval method fusing semantic information and multi-level similarity according to claim 4, characterized in that the hash layer has 64 neurons.
6. The deep hash image retrieval method fusing semantic information and multi-level similarity according to claim 3, characterized in that in step S5, the designed loss function is as follows:
in formula (8), 𝓛 denotes the loss function, s_{i,j} is the similarity between image i and image j, s^l_{i,j} is the label similarity between image i and image j, S is the similarity matrix, α is a hyperparameter adjusting the upper bound of the threshold, β is a hyperparameter adjusting the lower bound of the threshold, σ is the first parameter, δ is the second parameter, and N_bits is the length of the generated hash vector; b_i and b_j denote the approximate hash vectors of the i-th and j-th images respectively, and D(b_i, b_j) denotes the Euclidean distance between b_i and b_j; γ is a weight coefficient; 1 denotes the all-ones vector with the same dimension as b_i; the final terms denote, for each of the approximate hash vectors b_i and b_j, the sum of the differences between the element-wise absolute value of the vector and the all-ones vector.
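Formula (8) is referenced but not reproduced in this text, so the sketch below is only one plausible shape consistent with the listed symbols, not the patent's actual loss: a pairwise term on the Euclidean distance D(b_i, b_j) with similarity-gated upper (α) and lower (β) thresholds, plus a γ-weighted quantization term that drives |b| toward the all-ones vector so that sign(b) loses little information. All numeric values are illustrative.

```python
import numpy as np

def pairwise_loss(b_i, b_j, s_ij, alpha, beta, gamma, n_bits):
    # ASSUMED form, not the patent's exact formula (8): similar pairs
    # (large s_ij) are pulled inside an upper threshold alpha*n_bits,
    # dissimilar pairs are pushed beyond a lower threshold beta*n_bits.
    d = np.linalg.norm(b_i - b_j)          # Euclidean distance D(b_i, b_j)
    pull = s_ij * max(0.0, d - alpha * n_bits)
    push = (1.0 - s_ij) * max(0.0, beta * n_bits - d)
    # Quantization term: sum of |(|b| - 1)| over the components of
    # each approximate hash vector, as described for the final terms.
    quant = np.abs(np.abs(b_i) - 1.0).sum() + np.abs(np.abs(b_j) - 1.0).sum()
    return pull + push + gamma * quant

b1 = np.array([0.9, -0.8, 1.0])
b2 = np.array([1.0, -1.0, 0.7])
loss_similar = pairwise_loss(b1, b2, s_ij=1.0, alpha=0.1, beta=0.5,
                             gamma=0.01, n_bits=3)
```

For a fully similar pair (s_ij = 1) only the pull and quantization terms contribute, so the loss vanishes as the two vectors coincide and their components saturate at ±1.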
7. The deep hash image retrieval method fusing semantic information and multi-level similarity according to claim 6, characterized in that in step S6, the deep hash neural network model is trained by stochastic gradient descent;
in formula (9), μ denotes any parameter of the deep hash neural network model, μ′ denotes the updated parameter, λ denotes the magnitude of the update to μ, 𝓛 denotes the loss function, and the remaining term denotes the gradient of 𝓛 with respect to μ.
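Read together, the symbols of formula (9) describe the standard stochastic gradient descent update μ′ = μ − λ · ∂𝓛/∂μ. A minimal sketch on a toy quadratic loss (the loss and values are illustrative, not the patent's network loss):

```python
def sgd_step(mu, grad, lam):
    # Formula (9) as described: the updated parameter mu' equals the
    # old value minus the step size lambda times the loss gradient.
    return mu - lam * grad

# Toy loss L(mu) = (mu - 3)^2, whose gradient is 2 * (mu - 3);
# repeated updates drive mu toward the minimizer 3.
mu = 0.0
for _ in range(100):
    grad = 2.0 * (mu - 3.0)
    mu = sgd_step(mu, grad, lam=0.1)
```

Each step shrinks the error by a factor of (1 − 2λ) = 0.8, so after 100 steps μ is within numerical tolerance of 3.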
8. The deep hash image retrieval method fusing semantic information and multi-level similarity according to claim 7, characterized in that in step S7, the images in the image database are input into the deep hash neural network model trained in step S6 to obtain the approximate hash vector set B = {b_1, b_2, …, b_n}, where n is the number of images in the image database and b_n is the approximate hash vector of the n-th image; the set B is passed through the sign function to obtain the corresponding binary hash vector database set H = {h_1, h_2, …, h_n}, where h_n denotes the binary hash vector of the n-th image.
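Step S7 reduces to applying the sign function element-wise to each approximate hash vector. A minimal sketch, with illustrative vector values; mapping exact zeros to −1 is an implementation choice the claim does not specify:

```python
import numpy as np

def binarize(b):
    # Sign function: positive components map to +1, the rest to -1
    # (the treatment of exact zeros is an assumption).
    return np.where(b > 0, 1, -1)

# Approximate hash vectors B = {b_1, ..., b_n} as produced by the
# trained model (values here are illustrative).
B = [np.array([0.7, -0.2, 0.9]),
     np.array([-0.5, 0.1, -0.8])]

# Binary hash vector database H = {h_1, ..., h_n}.
H = [binarize(b) for b in B]
```

The resulting ±1 vectors are what the retrieval step compares against the query's binary hash vector.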
CN201910211486.6A 2019-03-20 2019-03-20 Deep hash image retrieval method fusing semantic information and multilevel similarity Active CN109977250B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910211486.6A CN109977250B (en) 2019-03-20 2019-03-20 Deep hash image retrieval method fusing semantic information and multilevel similarity

Publications (2)

Publication Number Publication Date
CN109977250A true CN109977250A (en) 2019-07-05
CN109977250B CN109977250B (en) 2023-03-28

Family

ID=67079595

Country Status (1)

Country Link
CN (1) CN109977250B (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110532417A (en) * 2019-09-02 2019-12-03 河北省科学院应用数学研究所 Image search method, device and terminal device based on depth Hash
CN111143400A (en) * 2019-12-26 2020-05-12 长城计算机软件与系统有限公司 Full-stack type retrieval method, system, engine and electronic equipment
CN111709252A (en) * 2020-06-17 2020-09-25 北京百度网讯科技有限公司 Model improvement method and device based on pre-trained semantic model
CN112734386A (en) * 2021-01-13 2021-04-30 国家电网有限公司 New energy network access full-flow through method and system based on association matching algorithm
CN112765382A (en) * 2021-01-20 2021-05-07 上海依图网络科技有限公司 Image searching method, image searching device, image searching medium and electronic equipment
CN113221658A (en) * 2021-04-13 2021-08-06 卓尔智联(武汉)研究院有限公司 Training method and device of image processing model, electronic equipment and storage medium
CN113641845A (en) * 2021-07-16 2021-11-12 广西师范大学 Depth feature contrast weighted image retrieval method based on vector contrast strategy
CN114219983A (en) * 2021-12-17 2022-03-22 国家电网有限公司信息通信分公司 Neural network training method, image retrieval method and device
CN115878823A (en) * 2023-03-03 2023-03-31 中南大学 Deep hash method based on graph convolution network and traffic data retrieval method
CN116645661A (en) * 2023-07-27 2023-08-25 深圳市青虹激光科技有限公司 Method and system for detecting duplicate prevention code
CN118152914A (en) * 2024-03-18 2024-06-07 山东管理学院 Semantic structure diagram guided ECG self-coding method and system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104834748A (en) * 2015-05-25 2015-08-12 中国科学院自动化研究所 Image retrieval method utilizing deep semantic to rank hash codes
CN105512289A (en) * 2015-12-07 2016-04-20 郑州金惠计算机系统工程有限公司 Image retrieval method based on deep learning and Hash
CN108399185A (en) * 2018-01-10 2018-08-14 中国科学院信息工程研究所 A kind of the binary set generation method and image, semantic similarity search method of multi-tag image
CN109165306A (en) * 2018-08-09 2019-01-08 长沙理工大学 Image search method based on the study of multitask Hash
CN109241313A (en) * 2018-08-14 2019-01-18 大连大学 A kind of image search method based on the study of high-order depth Hash
CN109284741A (en) * 2018-10-30 2019-01-29 武汉大学 A kind of extensive Remote Sensing Image Retrieval method and system based on depth Hash network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
彭天强 et al.: "Image Retrieval Method Based on Deep Convolutional Neural Network and Binary Hash Learning", Journal of Electronics & Information Technology *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110532417B (en) * 2019-09-02 2022-03-29 河北省科学院应用数学研究所 Image retrieval method and device based on depth hash and terminal equipment
CN110532417A (en) * 2019-09-02 2019-12-03 河北省科学院应用数学研究所 Image search method, device and terminal device based on depth Hash
CN111143400A (en) * 2019-12-26 2020-05-12 长城计算机软件与系统有限公司 Full-stack type retrieval method, system, engine and electronic equipment
CN111143400B (en) * 2019-12-26 2024-05-14 新长城科技有限公司 Full stack type retrieval method, system, engine and electronic equipment
CN111709252A (en) * 2020-06-17 2020-09-25 北京百度网讯科技有限公司 Model improvement method and device based on pre-trained semantic model
US11775766B2 (en) 2020-06-17 2023-10-03 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for improving model based on pre-trained semantic model
CN111709252B (en) * 2020-06-17 2023-03-28 北京百度网讯科技有限公司 Model improvement method and device based on pre-trained semantic model
CN112734386A (en) * 2021-01-13 2021-04-30 国家电网有限公司 New energy network access full-flow through method and system based on association matching algorithm
CN112765382A (en) * 2021-01-20 2021-05-07 上海依图网络科技有限公司 Image searching method, image searching device, image searching medium and electronic equipment
CN113221658A (en) * 2021-04-13 2021-08-06 卓尔智联(武汉)研究院有限公司 Training method and device of image processing model, electronic equipment and storage medium
CN113641845B (en) * 2021-07-16 2022-09-23 广西师范大学 Depth feature contrast weighted image retrieval method based on vector contrast strategy
CN113641845A (en) * 2021-07-16 2021-11-12 广西师范大学 Depth feature contrast weighted image retrieval method based on vector contrast strategy
CN114219983A (en) * 2021-12-17 2022-03-22 国家电网有限公司信息通信分公司 Neural network training method, image retrieval method and device
CN115878823A (en) * 2023-03-03 2023-03-31 中南大学 Deep hash method based on graph convolution network and traffic data retrieval method
CN115878823B (en) * 2023-03-03 2023-04-28 中南大学 Deep hash method and traffic data retrieval method based on graph convolution network
CN116645661A (en) * 2023-07-27 2023-08-25 深圳市青虹激光科技有限公司 Method and system for detecting duplicate prevention code
CN116645661B (en) * 2023-07-27 2023-11-14 深圳市青虹激光科技有限公司 Method and system for detecting duplicate prevention code
CN118152914A (en) * 2024-03-18 2024-06-07 山东管理学院 Semantic structure diagram guided ECG self-coding method and system

Also Published As

Publication number Publication date
CN109977250B (en) 2023-03-28

Similar Documents

Publication Publication Date Title
CN109977250A Deep hash image retrieval method fusing semantic information and multi-level similarity
Yue et al. Matching guided distillation
Biten et al. Good news, everyone! context driven entity-aware captioning for news images
CN106021364B (en) Foundation, image searching method and the device of picture searching dependency prediction model
Wang et al. Research on Web text classification algorithm based on improved CNN and SVM
CN108959396A (en) Machine reading model training method and device, answering method and device
CN110442684A (en) A kind of class case recommended method based on content of text
CN104598611B (en) The method and system being ranked up to search entry
CN108170736A (en) A kind of document based on cycle attention mechanism quickly scans qualitative method
CN109948029A (en) Based on the adaptive depth hashing image searching method of neural network
CN110059181A (en) Short text stamp methods, system, device towards extensive classification system
CN108984642B (en) Printed fabric image retrieval method based on Hash coding
CN110309839A (en) A kind of method and device of iamge description
CN107729311A (en) A kind of Chinese text feature extracting method of the fusing text tone
CN110825850B (en) Natural language theme classification method and device
CN113157919B (en) Sentence text aspect-level emotion classification method and sentence text aspect-level emotion classification system
CN109960732A (en) A kind of discrete Hash cross-module state search method of depth and system based on robust supervision
Li et al. Can vision transformers perform convolution?
CN115168579A (en) Text classification method based on multi-head attention mechanism and two-dimensional convolution operation
CN109522432A (en) A kind of image search method merging adaptive similarity and Bayesian frame
Du et al. Efficient network construction through structural plasticity
CN114065769B (en) Method, device, equipment and medium for training emotion reason pair extraction model
Li et al. Multimodal fusion with co-attention mechanism
CN108805280A (en) A kind of method and apparatus of image retrieval
Chen et al. Compressing fully connected layers using Kronecker tensor decomposition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant