CN109977250A - Deep hashing image retrieval method fusing semantic information and multilevel similarity - Google Patents
- Publication number
- CN109977250A (application CN201910211486.6A)
- Authority
- CN
- China
- Prior art keywords
- vector
- image
- hash
- label
- matrix
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a deep hashing image retrieval method fusing semantic information and multilevel similarity, comprising the following steps: S1, construct an image database; S2, construct a label-vector matrix and a semantic-vector matrix; S3, construct a similarity matrix; S4, build a deep hashing neural network model that converts original images into approximate hash vectors; S5, construct a loss function that places a lower bound on the Hamming distance between the hash vectors of similar pictures; S6, train the deep hashing neural network model; S7, construct a hash-vector database for the images; S8, compare the hash vector of the image to be retrieved with the vectors in the hash-vector database to find similar images. By fusing semantic information the invention improves retrieval precision, and by constraining the lower bound of the Hamming distance between the hash vectors of two similar pictures it improves ranking performance.
Description
Technical field
The present invention relates to the field of image retrieval technology, and in particular to a deep hashing image retrieval method fusing semantic information and multilevel similarity.
Background technique
In recent years, with the development of the Internet, massive image data has posed a huge challenge to image retrieval. Faced with large-scale and complex image data, a retrieval system must balance retrieval efficiency against retrieval quality, while also addressing the storage of massive information, so as to deliver a better user experience. Studying better image retrieval methods therefore has high practical significance.
Image retrieval based on deep hashing is currently a popular approach. Its advantage is that, by mapping pictures to binary hash vectors, it can exploit fast bitwise comparison to increase retrieval speed while reducing the required storage space.
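The bitwise-comparison advantage can be sketched as follows. This is an illustrative example with arbitrary 64-bit codes, not part of the patent: once binary hash vectors are packed as bits, the Hamming distance between two codes reduces to one XOR plus a population count.

```python
# Two 64-bit binary hash codes packed into Python ints (hypothetical values).
h1 = 0b1011001011110000101100101111000010110010111100001011001011110000
h2 = 0b1011001011110000101100101111000010110010111100001011001011010000

# Hamming distance via XOR + popcount: one machine-level comparison
# instead of 64 per-element comparisons of floating-point features.
hamming = bin(h1 ^ h2).count("1")   # number of differing bits
```

The same trick applies to any code length; retrieval then ranks database images by this bit-level distance.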
Faced with more complex pictures, traditional deep hashing methods show obvious shortcomings. On the one hand, they measure the similarity between different images too coarsely: as long as two pictures share a label they are treated as similar, and otherwise as dissimilar. This ignores both the finer-grained similarity grades between pictures and the semantic information the pictures contain. On the other hand, the traditional pairwise loss function uses only a single threshold to constrain the upper bound of the Hamming distance between the hash vectors of two similar pictures, with no constraint on the lower bound. As a result, the relative distances between images with different degrees of similarity cannot be guaranteed, which reduces the ranking accuracy of retrieval results.
Summary of the invention
The object of the present invention is to overcome the above deficiencies of the prior art by providing a deep hashing image retrieval method that fuses semantic information and multilevel similarity. By fusing semantic information it takes finer-grained similarity grades between pictures into account and improves retrieval precision; and by using a new pairwise loss function that places a lower bound on the Hamming distance between the hash vectors of similar pictures, it improves the ranking accuracy of retrieval results.
To achieve the above object, the present invention provides the following technical scheme:
A deep hashing image retrieval method fusing semantic information and multilevel similarity, comprising the following steps:
S1, construct an image database;
S2, construct a label-vector matrix and a semantic-vector matrix;
S3, construct a similarity matrix;
S4, build a deep hashing neural network model that converts original images into approximate hash vectors;
S5, construct a loss function that places a lower bound on the Hamming distance between the hash vectors of similar pictures;
S6, train the deep hashing neural network model;
S7, construct a hash-vector database for the images;
S8, compare the hash vector of the image to be retrieved with the vectors in the hash-vector database to find similar images.
Building the similarity matrix from the label-vector matrix and the semantic-vector matrix fuses semantic information and improves retrieval precision; constructing a loss function with a lower-bound constraint on the Hamming distance between the hash vectors of similar pictures improves the ranking accuracy of retrieval results.
Preferably, in step S2, the label-vector matrix and the semantic-vector matrix are constructed as follows: images with their labels and text descriptions are randomly selected from the image database. The label information is used to build the label-vector matrix L, where L[i, j] = 0 means the i-th picture does not carry the j-th label and L[i, j] = 1 means it does. Using natural language processing, the text description of each picture is encoded into a vector to build the picture semantic-vector matrix C, where C_i is the vector encoding the text description of the i-th picture and represents its semantic information.
Preferably, in step S3, the similarity matrix is constructed as follows:
S3-1, using the label-vector matrix built in step S2, take pairwise inner products of the label vectors to build the label similarity matrix:

    s_label[i, j] = (L L^T)[i, j] / L_total[i, j], 1 <= i, j <= n   (1)

In formula (1), S_label is the label similarity matrix, s_label[i, j] is the label similarity between image i and image j, n is the number of pictures, L is the label-vector matrix built in S2, L^T is the transpose of L, and L_total is the total-label matrix between pictures, where L_total[i, j] is the number of labels that pictures i and j carry in total;
S3-2, using the semantic-vector matrix built in step S2, compute the pairwise cosine similarity of the semantic vectors to build the semantic similarity matrix:

    s_seman[i, j] = c_{i,j}, 1 <= i, j <= n   (2)
    c_{i,j} = <C_i, C_j> / (||C_i|| * ||C_j||)   (3)

In formula (2), S_seman is the semantic similarity matrix and s_seman[i, j] is the semantic similarity between image i and image j, n being the number of pictures; in formula (3), C is the semantic-vector matrix built in S2 and ||C_i|| is the norm of the vector C_i;
S3-3, build the similarity matrix from the label similarity matrix and the semantic similarity matrix:

    s[i, j] = w * s_label[i, j] + (1 - w) * s_seman[i, j], 1 <= i, j <= n   (4)

In formula (4), S is the similarity matrix, s[i, j] is the similarity between image i and image j, n is the number of pictures, and w is a weight coefficient.
Preferably, step S4 comprises the following steps:
S4-1, build the AlexNet network model with the TensorFlow deep learning framework, and pre-train the AlexNet network model on the ImageNet dataset;
S4-2, modify the classical AlexNet model to construct the deep hashing neural network model.
The constructed deep hashing neural network is structured as follows:
It contains 5 convolutional layers: the first, second, third, fourth and fifth convolutional layers;
and 3 fully connected layers: the first fully connected layer, the second fully connected layer and the hash layer.
When AlexNet is built with TensorFlow and pre-trained, the model updates its parameters automatically by backpropagation without manual parameter tuning, so pre-training is simple.
Preferably, the hash layer contains 64 neurons.
Preferably, in step S5, the designed loss function is given by formula (8). In formula (8), L denotes the loss function; s[i, j] is the similarity between image i and image j, s_label[i, j] the label similarity between them, and S the similarity matrix; alpha is the hyperparameter adjusting the upper threshold and beta the hyperparameter adjusting the lower threshold; sigma is the first parameter and delta the second parameter; N_bits is the length of the generated hash vector; b_i and b_j are the approximate hash vectors of the i-th and j-th images, and D(b_i, b_j) is the Euclidean distance between them; gamma is a weight coefficient; 1 denotes the all-ones vector of the same dimension as b_i, and || |b_i| - 1 ||_1 and || |b_j| - 1 ||_1 denote the sums of elementwise differences between the absolute-value vectors of b_i, b_j and the all-ones vector. The upper and lower thresholds adapt to the similarity s[i, j] between pictures: for two pictures that share a label, the Euclidean distance between their approximate hash vectors should fall between the lower and upper bounds, while for two pictures that share no label the distance between their approximate hash vectors is widened as far as possible, improving retrieval precision.
Preferably, in step S6, the deep hashing neural network model is trained by stochastic gradient descent:

    mu' = mu - lambda * dL/dmu   (9)

In formula (9), mu is any parameter of the deep hashing neural network model, mu' its updated value, lambda the update amplitude, L the loss function and dL/dmu the gradient of L with respect to mu. Training by stochastic gradient descent updates the parameters of the deep hashing neural network model and improves retrieval precision.
Preferably, in step S7, the images in the image database are fed into the deep hashing neural network model trained in step S6, yielding the approximate hash vector set B = {b_1, b_2, ..., b_n}, where n is the number of images in the database and b_n is the approximate hash vector of the n-th image. Passing B through a sign function yields the corresponding binary hash vector database H = {h_1, h_2, ..., h_n}, where h_n denotes the binary hash vector of the n-th image. Representing picture features as binary hash vectors increases retrieval speed.
Compared with the prior art, the beneficial effects of the present invention are:
1. The similarity matrix is built from the label-vector matrix and the semantic-vector matrix, fusing semantic information and improving retrieval precision;
2. A loss function with a lower-bound constraint on the Hamming distance between the hash vectors of similar pictures improves the ranking accuracy of retrieval results;
3. The upper and lower thresholds adapt to the similarity s[i, j] between pictures: for two pictures sharing a label, the Euclidean distance between their approximate hash vectors falls between the bounds, while for two pictures sharing no label the distance is widened as far as possible, improving retrieval precision;
4. Picture features are represented as binary hash vectors, increasing retrieval speed.
Description of the drawings:
Fig. 1 is a flow chart of the deep hashing image retrieval method fusing semantic information and multilevel similarity of exemplary embodiment 1 of the present invention;
Fig. 2 is a structure diagram of the deep hashing neural network model of exemplary embodiment 1.
Reference numerals: 11 - first convolutional layer; 12 - second convolutional layer; 13 - third convolutional layer; 14 - fourth convolutional layer; 15 - fifth convolutional layer; 21 - first max-pooling layer; 22 - second max-pooling layer; 23 - third max-pooling layer; 31 - first fully connected layer; 32 - second fully connected layer; 41 - hash layer.
Specific embodiments
The present invention is described in further detail below with reference to test examples and specific embodiments. This should not be understood as limiting the scope of the invention to the following embodiments; all techniques realized on the basis of the present disclosure fall within the scope of the invention.
Embodiment 1
As shown in Fig. 1, this embodiment provides a deep hashing image retrieval method fusing semantic information and multilevel similarity, comprising the following steps.
S1: construct an image database.
Select the K most frequent label classes in a dataset, together with the pictures carrying these K labels, to build the image database.
In this embodiment the invention uses Microsoft's public COCO dataset, in which each image corresponds to several label classes (e.g. person, water, car, etc.). The invention selects the K label classes with the highest occurrence counts (sorted in descending order) and the images carrying those classes to build the image database. For example, the invention selects the 20 most frequent label classes in the COCO dataset and the corresponding images to build its image database.
S2: construct the label-vector matrix and the semantic-vector matrix.
Images with their labels and text descriptions are randomly selected from the image database to build the label-vector matrix and the semantic-vector matrix.
In this embodiment, the invention randomly selects n images and their labels from the image database to form the training set T = {t_1, t_2, ..., t_n}, where t_n denotes the n-th image and its labels, n >= 1; t_n = {I_n, L_n}, where I_n is the n-th image and L_n is the label vector of the n-th image. The label vectors form the label-vector matrix L of size n x K, where n is the number of images and K the number of label classes; L[i, j] = 0 means the i-th picture does not carry the j-th label, and L[i, j] = 1 means it does. In addition, using natural language processing, the text description of each picture is encoded into a vector to build the picture semantic-vector matrix C, where C_i is the vector encoding the text description of the i-th picture and represents its semantic information. In this embodiment the text description of each picture is encoded into a 512-dimensional vector.
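Step S2 can be sketched as follows. The label sets and the 4-dimensional "semantic" vectors below are toy stand-ins for illustration only; the embodiment uses 512-dimensional text encodings produced by an unspecified NLP model.

```python
import numpy as np

K = 3                              # number of label classes (toy value)
labels = [[0, 2], [0], [1, 2]]     # label indices per image (hypothetical)

# Build the n x K label-vector matrix L of step S2.
n = len(labels)
L = np.zeros((n, K), dtype=int)
for i, lab in enumerate(labels):
    L[i, lab] = 1                  # L[i, j] = 1 iff image i carries label j

# Toy semantic-vector matrix C (one row per image's encoded description).
C = np.array([[0.9, 0.1, 0.0, 0.4],
              [0.8, 0.2, 0.1, 0.5],
              [0.0, 1.0, 0.3, 0.1]])
```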
S3: construct the similarity matrix.
S3-1: build the label similarity matrix from the label-vector matrix of step S2.
In this embodiment, pairwise inner products of all label vectors L_n are taken, and each inner product is divided by the total number of labels the two images involve; the results form the label similarity matrix, of size n x n, where n is the number of images. The label similarity matrix S_label is expressed as:

    s_label[i, j] = (L L^T)[i, j] / L_total[i, j], 1 <= i, j <= n   (1)

In formula (1), S_label is the label similarity matrix, s_label[i, j] is the label similarity between image i and image j, n is the number of pictures, L is the label-vector matrix built in S2, L^T is the transpose of L, and L_total is the total-label matrix between pictures, where L_total[i, j] is the number of labels that pictures i and j carry in total.
S3-2: compute the pairwise cosine similarity of the semantic vectors of step S2 to build the semantic similarity matrix.
In this embodiment the semantic similarity matrix has size n x n, n being the number of images, and is expressed as:

    s_seman[i, j] = c_{i,j}, 1 <= i, j <= n   (2)
    c_{i,j} = <C_i, C_j> / (||C_i|| * ||C_j||)   (3)

In formula (2), S_seman is the semantic similarity matrix and s_seman[i, j] is the semantic similarity between image i and image j, n being the number of pictures; in formula (3), C is the semantic-vector matrix built in S2 and ||C_i|| is the norm of the vector C_i.
S3-3: build the similarity matrix from the label similarity matrix and the semantic similarity matrix.
In this embodiment the similarity matrix S merges the label similarity matrix and the semantic similarity matrix obtained in steps S3-1 and S3-2. Its size is n x n, n being the number of images:

    s[i, j] = w * s_label[i, j] + (1 - w) * s_seman[i, j], 1 <= i, j <= n   (4)

In formula (4), S is the similarity matrix, s[i, j] is the similarity between image i and image j, n is the number of pictures, and w is a weight coefficient, set to 0.5 in this embodiment.
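The three sub-steps above can be sketched numerically. Because the patent's equation images are not reproduced in this text, two details are assumptions consistent with the surrounding description: L_total is taken as the count of distinct labels the pair carries (union), and the fusion is the weighted sum S = w * S_label + (1 - w) * S_seman.

```python
import numpy as np

L = np.array([[1, 0, 1],
              [1, 0, 0],
              [0, 1, 1]])            # n x K label-vector matrix from S2 (toy)
C = np.array([[0.9, 0.1, 0.0, 0.4],
              [0.8, 0.2, 0.1, 0.5],
              [0.0, 1.0, 0.3, 0.1]]) # toy semantic-vector matrix from S2
w = 0.5                              # weight coefficient of the embodiment

# (1) label similarity: shared labels / labels the pair carries in total.
shared = L @ L.T                     # shared[i, j] = labels images i, j share
counts = L.sum(axis=1)
L_total = counts[:, None] + counts[None, :] - shared  # union count (assumed)
S_label = shared / L_total

# (2)-(3) semantic similarity: pairwise cosine similarity of the rows of C.
norms = np.linalg.norm(C, axis=1)
S_seman = (C @ C.T) / np.outer(norms, norms)

# (4) fused similarity matrix (assumed weighted-sum form).
S = w * S_label + (1 - w) * S_seman
```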
S4: build the deep hashing neural network model and convert original images into approximate hash vectors.
S4-1: build the AlexNet network model with the TensorFlow deep learning framework, and pre-train it on the ImageNet dataset.
TensorFlow is a relatively high-level machine learning library with which users can conveniently design neural network structures; it supports automatic differentiation, so users no longer need to derive gradients by hand for backpropagation. CNNs (convolutional neural networks) are the core algorithm of image recognition and classification, and AlexNet is a classical deep learning model among them. The AlexNet network model is built with TensorFlow and pre-trained on the ImageNet dataset, i.e. the randomly initialized AlexNet network is trained for classification so that its parameters learn generic features. The parameters are updated automatically by backpropagation, and no manual parameter tuning is needed.
S4-2: construct the deep hashing neural network model.
This embodiment modifies the classical AlexNet model to construct the deep hashing neural network model and improve retrieval precision.
The invention removes the last fully connected layer of the pre-trained AlexNet network model, retains the remaining structure and parameters, and appends a new hash layer fch to the network. The fch layer contains 64 neurons, and its activation function is set to tanh, so that every neuron output of fch lies in [-1, 1]. The structure of the constructed deep hashing neural network model is shown in Fig. 2:
It contains 5 convolutional layers: the first convolutional layer (conv1) 11, second convolutional layer (conv2) 12, third convolutional layer (conv3) 13, fourth convolutional layer (conv4) 14 and fifth convolutional layer (conv5) 15;
and 3 fully connected layers: the first fully connected layer (fc6) 31, second fully connected layer (fc7) 32 and hash layer (fch) 41.
The input of the first convolutional layer (conv1) 11 receives the original image; its output feeds the first max-pooling layer 21, whose output feeds the second convolutional layer (conv2) 12. The output of conv2 feeds the second max-pooling layer 22, whose output feeds the third convolutional layer (conv3) 13. Conv3 feeds the fourth convolutional layer (conv4) 14, which feeds the fifth convolutional layer (conv5) 15. The output of conv5 feeds the third max-pooling layer 23, whose output feeds the first fully connected layer (fc6) 31. Fc6 feeds the second fully connected layer (fc7) 32, whose output feeds the hash layer (fch) 41; the output of the hash layer (fch) 41 is the approximate hash vector.
The invention feeds the original image into the deep hashing neural network model; after the mappings of the convolutional and fully connected layers, an approximate hash vector is obtained, each dimension of which lies in [-1, 1]. For example, an image of original size 227 x 227 x 3 fed into the constructed model passes through the 5 convolutional layers and 3 fully connected layers and yields a 64-dimensional approximate hash vector. The invention can also feed multiple images at once, obtaining the approximate hash vector set B = {b_1, b_2, ..., b_n}, where b_n denotes the approximate hash vector of the n-th image.
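The effect of the hash layer fch can be illustrated in isolation. The weights below are random stand-ins, not pretrained AlexNet parameters; the point is only that a 64-neuron fully connected layer with tanh activation maps fc7-style features to an approximate hash vector whose components all lie in [-1, 1].

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the 4096-dim fc7 feature vector of one image.
fc7_features = rng.standard_normal(4096)

# Hash layer fch: 64 neurons, tanh activation (weights random, illustrative).
W = rng.standard_normal((64, 4096)) * 0.01
bias = np.zeros(64)
approx_hash = np.tanh(W @ fc7_features + bias)   # approximate hash vector b
```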
S5: construct a loss function with a lower-bound constraint on the Hamming distance between the hash vectors of similar pictures.
S5-1: compute the similarity between the approximate hash vectors of pictures.
In this embodiment, for the approximate hash vector set B = {b_1, b_2, ..., b_n} obtained in S4, the Euclidean distance D(b_i, b_j) between two approximate hash vectors in the set expresses their similarity, where b_i and b_j denote the approximate hash vectors of the i-th and j-th images.
S5-2: based on the Euclidean distance D(b_i, b_j), construct the pairwise loss function L_p, expressed as formula (5).
In formula (5), L_p denotes the pairwise loss function; s[i, j] is the similarity between image i and image j, s_label[i, j] the label similarity between them, and S the similarity matrix; alpha is the hyperparameter adjusting the upper threshold and beta the hyperparameter adjusting the lower threshold; sigma is the first parameter, set to 2.5 in this embodiment, and delta the second parameter, set to 1.5 in this embodiment; N_bits is the length of the generated hash vector, 64 in this embodiment; b_i and b_j are the approximate hash vectors of the i-th and j-th images, and D(b_i, b_j) is the Euclidean distance between them.
The meaning of this pairwise loss is: for two pictures that share a label, the Euclidean distance between their approximate hash vectors should fall between the lower and upper bounds, where both thresholds adapt to the similarity s[i, j] between the pictures. For two pictures that share no label, the Euclidean distance between their approximate hash vectors is widened as far as possible, and no loss is produced once it exceeds the defined threshold delta * N_bits.
S5-3: define the quantization loss L_q for the approximate hash vectors, expressed as formula (6).
In formula (6), 1 denotes the all-ones vector of the same dimension as b_i, and || |b_i| - 1 ||_1 and || |b_j| - 1 ||_1 denote the sums of elementwise differences between the absolute-value vectors of b_i, b_j and the all-ones vector.
The meaning of this formula is: the closer each dimension of an approximate hash vector is to 1 or -1, the higher the probability that it is a valid hash vector, and the smaller the loss produced.
S5-4: construct the complete loss function.
In this embodiment the loss function L merges the pairwise loss L_p and the quantization loss L_q as L = L_p + gamma * L_q (7), where gamma denotes the weight coefficient of the quantization loss, set to 1.0 in this embodiment. Substituting formulas (5) and (6) into (7) yields the complete loss function L of formula (8).
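The loss of S5 can be sketched as follows. Because formulas (5)-(8) are not reproduced in this text, the exact expressions below are assumptions that merely realize the behavior described above: a squared Euclidean distance compared against similarity-adaptive lower/upper bounds for label-sharing pairs, a hinge pushing non-sharing pairs beyond delta * N_bits, and a quantization term pulling each component toward +/-1. The values of alpha and beta and the form of the bounds are hypothetical.

```python
import numpy as np

alpha, beta = 1.0, 0.25   # upper/lower threshold hyperparameters (assumed values)
sigma, delta = 2.5, 1.5   # first and second parameters from the embodiment
gamma = 1.0               # quantization-loss weight from the embodiment
N_bits = 64               # hash-vector length from the embodiment

def pair_loss(b_i, b_j, s_ij, shares_label):
    d2 = float(np.sum((b_i - b_j) ** 2))   # squared Euclidean distance (assumed)
    if shares_label:
        # Bounds adapt to the pair similarity s_ij (assumed functional form).
        upper = alpha * sigma * (1.0 - s_ij) * N_bits
        lower = beta * sigma * (1.0 - s_ij) * N_bits
        hinge = max(0.0, d2 - upper) + max(0.0, lower - d2)
    else:
        # Widen pairs sharing no label until d2 exceeds delta * N_bits.
        hinge = max(0.0, delta * N_bits - d2)
    # Quantization loss: L1 distance of |b| from the all-ones vector.
    quant = float(np.abs(np.abs(b_i) - 1).sum() + np.abs(np.abs(b_j) - 1).sum())
    return hinge + gamma * quant
```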
S6: train the constructed deep hashing neural network model.
S6-1: construct the optimization objective from the loss function.
In this embodiment the invention uses the constructed loss function to form the optimization objective min_Theta L, i.e. find the values of all parameters in Theta that minimize L, where Theta is the set of parameters of the deep hashing neural network model and L the constructed loss function.
S6-2: solve the optimization objective by stochastic gradient descent.
In this embodiment the invention solves the optimization objective by stochastic gradient descent: the gradient of the loss function L with respect to each parameter mu is computed, and the parameter is updated in the opposite direction of the gradient:

    mu' = mu - lambda * dL/dmu   (9)

In formula (9), mu denotes any parameter of the deep hashing neural network model, mu' its updated value, L the loss function and dL/dmu the gradient of L with respect to mu; lambda is the update amplitude (i.e. the learning rate), and may be set to 0.0003.
In this embodiment the batch size is 256 and the number of iterations is 10000.
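The update rule of formula (9) can be sketched directly. The toy gradient below stands in for the network's backpropagated gradient; only the update rule itself is the point.

```python
import numpy as np

lam = 0.0003                       # learning rate lambda from the embodiment

def sgd_step(mu, grad):
    # Formula (9): mu' = mu - lambda * dL/dmu.
    return mu - lam * grad

mu = np.array([1.0, -2.0])         # toy parameters
grad = np.array([2.0, -4.0])       # toy gradient, e.g. of L = mu_0^2 + mu_1^2
mu_new = sgd_step(mu, grad)
```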
S7: construct the hash-vector database of the images.
In this embodiment the invention feeds the images in the image database into the trained deep hashing neural network model, obtaining the approximate hash vector set B = {b_1, b_2, ..., b_n}, where n is the number of images in the image database and b_n is the approximate hash vector of the n-th image. B is then passed through a sign function (whose effect is to map numbers greater than or equal to 0 to 1, and numbers less than 0 to -1), obtaining the corresponding binary hash vector database H = {h_1, h_2, ..., h_n}, where n is the number of images in the image database and h_n denotes the binary hash vector of the n-th image.
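The sign-function quantization of S7 can be sketched as follows; the approximate hash vectors are toy 4-dimensional values rather than the embodiment's 64-dimensional ones.

```python
import numpy as np

def sign_quantize(b):
    # Map components >= 0 to 1 and components < 0 to -1, as described in S7.
    return np.where(b >= 0, 1, -1)

# Toy approximate hash vectors b_1, b_2 with values in [-1, 1].
B = np.array([[0.9, -0.3, 0.0, -0.99],
              [-0.1, 0.8, 0.7, 0.2]])
H = sign_quantize(B)   # binary hash vector database
```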
S8: compare the hash vector of the image to be retrieved with the vectors in the hash-vector database to find similar images.
In this embodiment the invention feeds the image i to be retrieved into the trained deep hashing neural network model, obtaining its approximate hash vector b_i, and then applies the sign function to obtain the binary hash vector h_i of image i. The hash vector h_i is compared (via the AND operation) with every hash vector in the constructed image hash-vector database, and the resulting values are sorted in descending order. A larger result value indicates a higher similarity between that hash vector and h_i, i.e. the image corresponding to that hash vector is more similar to the query image i, which ensures retrieval precision. For example, h_i is compared with the first hash vector of the image hash-vector database to obtain a first result value, and with the second hash vector to obtain a second result value; if the first result value is greater than the second, the image corresponding to the first hash vector is more similar to the query image than the image corresponding to the second hash vector.
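Under the reading that the comparison operation on {-1, 1} codes amounts to an inner product (for N-bit codes, <h_i, h> = N - 2 * Hamming(h_i, h), so a larger score means a smaller Hamming distance), step S8 can be sketched as follows with toy 4-bit codes.

```python
import numpy as np

def rank_database(h_query, H_db):
    # One score per database image; larger score = fewer differing bits.
    scores = H_db @ h_query
    order = np.argsort(-scores)          # indices, most similar first
    return order, scores

h_query = np.array([1, -1, 1, 1])
H_db = np.array([[1, -1, 1, 1],          # identical to the query
                 [1, 1, 1, 1],           # differs in one bit
                 [-1, 1, -1, -1]])       # opposite in every bit
order, scores = rank_database(h_query, H_db)
```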
The above is only a detailed description of specific embodiments of the invention, not a limitation of it. Various replacements, modifications and improvements made by those skilled in the relevant technical fields without departing from the principle and scope of the invention shall all fall within the protection scope of the invention.
Claims (8)
1. A deep hashing image retrieval method fusing semantic information and multilevel similarity, characterized by comprising the following steps:
S1, construct an image database;
S2, construct a label-vector matrix and a semantic-vector matrix;
S3, construct a similarity matrix;
S4, build a deep hashing neural network model that converts original images into approximate hash vectors;
S5, construct a loss function that places a lower bound on the Hamming distance between the hash vectors of similar pictures;
S6, train the deep hashing neural network model;
S7, construct a hash-vector database for the images;
S8, compare the hash vector of the image to be retrieved with the vectors in the hash-vector database to find similar images.
2. The deep hashing image retrieval method fusing semantic information and multilevel similarity of claim 1, characterized in that, in step S2, the label-vector matrix and the semantic-vector matrix are constructed as follows: images with their labels and text descriptions are randomly selected from the image database; the label information is used to build the label-vector matrix L, where L[i, j] = 0 means the i-th picture does not carry the j-th label and L[i, j] = 1 means it does; using natural language processing, the text description of each picture is encoded into a vector to build the picture semantic-vector matrix C, where C_i is the vector encoding the text description of the i-th picture and represents its semantic information.
3. The deep hash image retrieval method fusing semantic information and multilevel similarity according to claim 2, characterized in that, in step S3, the similarity matrix is constructed as follows:
S3-1: using the label vector matrix built in step S2, the mutual inner products of the label vectors are computed to construct the label similarity matrix:
In formula (1), S_label is the label similarity matrix, s^{label}_{i,j} is the label similarity between image i and image j, n is the number of pictures, L is the label vector matrix built in S2, L^T is the transpose of L, and L_total is the total-label matrix between pictures, where L_total[i, j] is the total number of labels carried by pictures i and j;
S3-2: using the semantic vector matrix built in step S2, the mutual cosine similarities of the semantic vectors are computed to construct the semantic similarity matrix:
In formulas (2) and (3), S_seman is the semantic similarity matrix, s^{seman}_{i,j} is the semantic similarity between image i and image j, n is the number of pictures, C is the semantic vector matrix built in S2, and ||C_i|| is the norm of the vector C_i;
S3-3: the similarity matrix is constructed from the label similarity matrix and the semantic similarity matrix:
In formula (4), S is the similarity matrix, s_{i,j} is the similarity between image i and image j, n is the number of pictures, and w is a weight coefficient.
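Steps S3-1 to S3-3 can be sketched as follows. The formula images (1)-(4) are not reproduced in the text, so the exact normalisation of the inner products by L_total and the weighted combination S = w·S_label + (1-w)·S_seman are assumptions read off the surrounding definitions, not the patent's literal formulas.

```python
import numpy as np

# Toy inputs: a label vector matrix L and a semantic vector matrix C.
L = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0]], dtype=float)
C = np.array([[1.0, 2.0], [2.0, 1.0], [1.0, 1.0]])

# S3-1: label similarity = shared labels / total labels (assumed normalisation).
inner = L @ L.T                               # labels shared by pictures i and j
counts = L.sum(axis=1)
L_total = counts[:, None] + counts[None, :]   # labels of i and j in total
S_label = inner / L_total

# S3-2: mutual cosine similarity of the semantic vectors.
norms = np.linalg.norm(C, axis=1)
S_seman = (C @ C.T) / np.outer(norms, norms)

# S3-3: weighted combination with weight coefficient w (value illustrative).
w = 0.5
S = w * S_label + (1 - w) * S_seman
```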
4. The deep hash image retrieval method fusing semantic information and multilevel similarity according to claim 1, characterized in that step S4 comprises the following steps:
S4-1: building an AlexNet network model with the TensorFlow open-source deep learning framework, and pre-training the AlexNet network model on the ImageNet dataset;
S4-2: modifying the classical AlexNet model to construct the deep hash neural network model;
the deep hash neural network structure is built as follows:
five convolutional layers: the first, second, third, fourth, and fifth convolutional layers;
three fully connected layers: the first fully connected layer, the second fully connected layer, and the hash layer.
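The network plan of claim 4 (five convolutional layers, two fully connected layers, and a hash layer) can be recorded as a small configuration table. The filter counts and kernel sizes below follow the classical AlexNet and are assumptions; the claim itself fixes only the layer counts, and the 64-unit hash layer comes from claim 5.

```python
# Assumed layer plan of the modified AlexNet; only the counts and the
# hash layer width are stated in the claims, the rest mirrors AlexNet.
layers = [
    ("conv1", {"filters": 96,  "kernel": 11, "stride": 4}),
    ("conv2", {"filters": 256, "kernel": 5,  "stride": 1}),
    ("conv3", {"filters": 384, "kernel": 3,  "stride": 1}),
    ("conv4", {"filters": 384, "kernel": 3,  "stride": 1}),
    ("conv5", {"filters": 256, "kernel": 3,  "stride": 1}),
    ("fc6",   {"units": 4096}),
    ("fc7",   {"units": 4096}),
    ("hash",  {"units": 64, "activation": "tanh"}),  # output = approximate hash vector
]
n_conv = sum(name.startswith("conv") for name, _ in layers)
n_fc = sum(1 for name, _ in layers if name.startswith("fc")) + 1  # fc6, fc7, hash
```

In a TensorFlow build, the first seven layers would be initialised from the ImageNet pre-trained AlexNet (S4-1) and the hash layer trained from scratch (S4-2).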
5. The deep hash image retrieval method fusing semantic information and multilevel similarity according to claim 4, characterized in that the number of neurons in the hash layer is 64.
6. The deep hash image retrieval method fusing semantic information and multilevel similarity according to claim 3, characterized in that, in step S5, the designed loss function is:
In formula (8), 𝓛 denotes the loss function, s_{i,j} is the similarity between image i and image j, s^{label}_{i,j} is the label similarity between image i and image j, S is the similarity matrix, α is the hyperparameter adjusting the upper bound of the threshold, β is the hyperparameter adjusting the lower bound of the threshold, σ is the first parameter, δ is the second parameter, and N_bits is the length of the generated hash vector; b_i and b_j respectively denote the approximate hash vectors of the i-th and j-th images, and the distance term denotes the Euclidean distance between b_i and b_j; γ is a weight coefficient; 1 denotes the all-ones vector with the same dimension as b_i; and the remaining two terms respectively denote, for the approximate hash vectors b_i and b_j, the sum of the element-wise differences between the absolute-value vector and the all-ones vector.
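Formula (8) itself is not reproduced in the text. The sketch below therefore only combines the ingredients the claim names: a similarity-dependent distance term with an upper-bound threshold α for similar pairs and a lower-bound threshold β for dissimilar pairs, plus a quantisation term (weighted by γ) pulling each element of the approximate hash vector towards ±1. The exact functional form is an assumption, not the patent's loss.

```python
import numpy as np

def pairwise_loss(b_i, b_j, s_ij, alpha=2.0, beta=6.0, gamma=0.1):
    """Hedged stand-in for formula (8): bounded-distance loss + quantisation."""
    d = np.sum((b_i - b_j) ** 2)            # squared Euclidean distance
    if s_ij > 0:                            # similar pair: pull distance under alpha
        sim_term = max(d - alpha, 0.0)
    else:                                   # dissimilar pair: push distance past beta
        sim_term = max(beta - d, 0.0)
    # Quantisation: sum of differences between |b| and the all-ones vector.
    quant = np.sum(np.abs(np.abs(b_i) - 1.0)) + np.sum(np.abs(np.abs(b_j) - 1.0))
    return sim_term + gamma * quant
```

A similar pair of already-binary identical codes incurs zero loss, while a dissimilar pair with identical codes is penalised by the lower bound β.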
7. The deep hash image retrieval method fusing semantic information and multilevel similarity according to claim 6, characterized in that, in step S6, the deep hash neural network model is trained by stochastic gradient descent;
In formula (9), μ denotes any parameter of the deep hash neural network model, μ' denotes the updated parameter, λ denotes the magnitude of the update to μ, 𝓛 denotes the loss function, and the last factor denotes the gradient of 𝓛 with respect to μ.
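Formula (9) describes the standard stochastic gradient descent update μ' = μ − λ·∂𝓛/∂μ. A minimal sketch on a toy quadratic loss 𝓛(μ) = (μ − 1)², whose gradient is 2(μ − 1):

```python
def sgd_step(mu, grad, lam):
    """One update of formula (9): mu' = mu - lam * d(loss)/d(mu)."""
    return mu - lam * grad

mu = 4.0
for _ in range(50):
    mu = sgd_step(mu, 2.0 * (mu - 1.0), lam=0.1)
# mu converges towards the minimiser 1.0 of the toy loss
```

In the patent's setting, μ ranges over all network weights and the gradient is obtained by backpropagation through the deep hash network.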
8. The deep hash image retrieval method fusing semantic information and multilevel similarity according to claim 7, characterized in that, in step S7, the images in the image database are input into the deep hash neural network model trained in step S6 to obtain the approximate hash vector set B = {b_1, b_2, …, b_n}, where n is the number of images in the image database and b_n is the approximate hash vector of the n-th image; the approximate hash vector set B is then passed through a sign function to obtain the corresponding binary hash vector database set H = {h_1, h_2, …, h_n}, where n is the number of images in the image database and h_n denotes the binary hash vector of the n-th image.
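Steps S7 and S8 can be sketched end to end with synthetic data: binarise the approximate hash vectors with a sign function, then rank the binary codes by Hamming distance to the query code. The random vectors stand in for the network outputs.

```python
import numpy as np

rng = np.random.default_rng(0)
B = np.tanh(rng.normal(size=(5, 8)))   # approximate hash vectors b_1..b_n (synthetic)
H = np.where(B >= 0, 1, -1)            # S7: binary hash vector database h_1..h_n

query = H[2]                           # pretend the third image is the query
hamming = np.count_nonzero(H != query, axis=1)  # S8: Hamming distance to each code
nearest = np.argsort(hamming)          # database indices ranked by similarity
```

In practice the ±1 codes would be packed into bit strings so the Hamming distance reduces to an XOR and a popcount.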
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910211486.6A CN109977250B (en) | 2019-03-20 | 2019-03-20 | Deep hash image retrieval method fusing semantic information and multilevel similarity |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109977250A true CN109977250A (en) | 2019-07-05 |
CN109977250B CN109977250B (en) | 2023-03-28 |
Family
ID=67079595
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910211486.6A Active CN109977250B (en) | 2019-03-20 | 2019-03-20 | Deep hash image retrieval method fusing semantic information and multilevel similarity |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109977250B (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110532417A (en) * | 2019-09-02 | 2019-12-03 | 河北省科学院应用数学研究所 | Image search method, device and terminal device based on depth Hash |
CN111143400A (en) * | 2019-12-26 | 2020-05-12 | 长城计算机软件与系统有限公司 | Full-stack type retrieval method, system, engine and electronic equipment |
CN111709252A (en) * | 2020-06-17 | 2020-09-25 | 北京百度网讯科技有限公司 | Model improvement method and device based on pre-trained semantic model |
CN112734386A (en) * | 2021-01-13 | 2021-04-30 | 国家电网有限公司 | New energy network access full-flow through method and system based on association matching algorithm |
CN112765382A (en) * | 2021-01-20 | 2021-05-07 | 上海依图网络科技有限公司 | Image searching method, image searching device, image searching medium and electronic equipment |
CN113221658A (en) * | 2021-04-13 | 2021-08-06 | 卓尔智联(武汉)研究院有限公司 | Training method and device of image processing model, electronic equipment and storage medium |
CN113641845A (en) * | 2021-07-16 | 2021-11-12 | 广西师范大学 | Depth feature contrast weighted image retrieval method based on vector contrast strategy |
CN114219983A (en) * | 2021-12-17 | 2022-03-22 | 国家电网有限公司信息通信分公司 | Neural network training method, image retrieval method and device |
CN115878823A (en) * | 2023-03-03 | 2023-03-31 | 中南大学 | Deep hash method based on graph convolution network and traffic data retrieval method |
CN116645661A (en) * | 2023-07-27 | 2023-08-25 | 深圳市青虹激光科技有限公司 | Method and system for detecting duplicate prevention code |
CN118152914A (en) * | 2024-03-18 | 2024-06-07 | 山东管理学院 | Semantic structure diagram guided ECG self-coding method and system |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104834748A (en) * | 2015-05-25 | 2015-08-12 | Institute of Automation, Chinese Academy of Sciences | Image retrieval method using deep semantics to rank hash codes |
CN105512289A (en) * | 2015-12-07 | 2016-04-20 | Zhengzhou Jinhui Computer System Engineering Co., Ltd. | Image retrieval method based on deep learning and hashing |
CN108399185A (en) * | 2018-01-10 | 2018-08-14 | Institute of Information Engineering, Chinese Academy of Sciences | Binary vector generation method and image semantic similarity retrieval method for multi-label images |
CN109165306A (en) * | 2018-08-09 | 2019-01-08 | Changsha University of Science and Technology | Image retrieval method based on multi-task hash learning |
CN109241313A (en) * | 2018-08-14 | 2019-01-18 | Dalian University | Image retrieval method based on high-order deep hash learning |
CN109284741A (en) * | 2018-10-30 | 2019-01-29 | Wuhan University | Large-scale remote sensing image retrieval method and system based on deep hash network |
Non-Patent Citations (1)
Title |
---|
PENG Tianqiang et al.: "Image retrieval method based on deep convolutional neural network and binary hash learning", Journal of Electronics & Information Technology * |
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110532417B (en) * | 2019-09-02 | 2022-03-29 | 河北省科学院应用数学研究所 | Image retrieval method and device based on depth hash and terminal equipment |
CN110532417A (en) * | 2019-09-02 | 2019-12-03 | 河北省科学院应用数学研究所 | Image search method, device and terminal device based on depth Hash |
CN111143400A (en) * | 2019-12-26 | 2020-05-12 | 长城计算机软件与系统有限公司 | Full-stack type retrieval method, system, engine and electronic equipment |
CN111143400B (en) * | 2019-12-26 | 2024-05-14 | 新长城科技有限公司 | Full stack type retrieval method, system, engine and electronic equipment |
CN111709252A (en) * | 2020-06-17 | 2020-09-25 | 北京百度网讯科技有限公司 | Model improvement method and device based on pre-trained semantic model |
US11775766B2 (en) | 2020-06-17 | 2023-10-03 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method and apparatus for improving model based on pre-trained semantic model |
CN111709252B (en) * | 2020-06-17 | 2023-03-28 | 北京百度网讯科技有限公司 | Model improvement method and device based on pre-trained semantic model |
CN112734386A (en) * | 2021-01-13 | 2021-04-30 | 国家电网有限公司 | New energy network access full-flow through method and system based on association matching algorithm |
CN112765382A (en) * | 2021-01-20 | 2021-05-07 | 上海依图网络科技有限公司 | Image searching method, image searching device, image searching medium and electronic equipment |
CN113221658A (en) * | 2021-04-13 | 2021-08-06 | 卓尔智联(武汉)研究院有限公司 | Training method and device of image processing model, electronic equipment and storage medium |
CN113641845B (en) * | 2021-07-16 | 2022-09-23 | 广西师范大学 | Depth feature contrast weighted image retrieval method based on vector contrast strategy |
CN113641845A (en) * | 2021-07-16 | 2021-11-12 | 广西师范大学 | Depth feature contrast weighted image retrieval method based on vector contrast strategy |
CN114219983A (en) * | 2021-12-17 | 2022-03-22 | 国家电网有限公司信息通信分公司 | Neural network training method, image retrieval method and device |
CN115878823A (en) * | 2023-03-03 | 2023-03-31 | 中南大学 | Deep hash method based on graph convolution network and traffic data retrieval method |
CN115878823B (en) * | 2023-03-03 | 2023-04-28 | 中南大学 | Deep hash method and traffic data retrieval method based on graph convolution network |
CN116645661A (en) * | 2023-07-27 | 2023-08-25 | 深圳市青虹激光科技有限公司 | Method and system for detecting duplicate prevention code |
CN116645661B (en) * | 2023-07-27 | 2023-11-14 | 深圳市青虹激光科技有限公司 | Method and system for detecting duplicate prevention code |
CN118152914A (en) * | 2024-03-18 | 2024-06-07 | 山东管理学院 | Semantic structure diagram guided ECG self-coding method and system |
Also Published As
Publication number | Publication date |
---|---|
CN109977250B (en) | 2023-03-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109977250A (en) | Deep hash image retrieval method fusing semantic information and multilevel similarity | |
Yue et al. | Matching guided distillation | |
Biten et al. | Good news, everyone! context driven entity-aware captioning for news images | |
CN106021364B (en) | Establishment of an image-search relevance prediction model, and image search method and device | |
Wang et al. | Research on Web text classification algorithm based on improved CNN and SVM | |
CN108959396A (en) | Machine reading model training method and device, and question answering method and device | |
CN110442684A (en) | A similar-case recommendation method based on text content | |
CN104598611B (en) | Method and system for ranking search entries | |
CN108170736A (en) | A rapid document-skimming qualitative method based on a recurrent attention mechanism | |
CN109948029A (en) | Neural-network-based adaptive deep hash image retrieval method | |
CN110059181A (en) | Short text tagging method, system and device for large-scale classification systems | |
CN108984642B (en) | Printed fabric image retrieval method based on Hash coding | |
CN110309839A (en) | An image captioning method and device | |
CN107729311A (en) | A Chinese text feature extraction method fusing the tone of the text | |
CN110825850B (en) | Natural language theme classification method and device | |
CN113157919B (en) | Sentence text aspect-level emotion classification method and sentence text aspect-level emotion classification system | |
CN109960732A (en) | Robust-supervision-based deep discrete cross-modal hashing retrieval method and system | |
Li et al. | Can vision transformers perform convolution? | |
CN115168579A (en) | Text classification method based on multi-head attention mechanism and two-dimensional convolution operation | |
CN109522432A (en) | An image retrieval method fusing adaptive similarity and a Bayesian framework | |
Du et al. | Efficient network construction through structural plasticity | |
CN114065769B (en) | Method, device, equipment and medium for training emotion reason pair extraction model | |
Li et al. | Multimodal fusion with co-attention mechanism | |
CN108805280A (en) | A method and apparatus for image retrieval | |
Chen et al. | Compressing fully connected layers using Kronecker tensor decomposition |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||