CN113326393B - Image retrieval method based on deep hash feature and heterogeneous parallel processing - Google Patents


Info

Publication number: CN113326393B (application CN202110600390.6A)
Authority: CN (China)
Prior art keywords: hash, image, binary, deep, network model
Legal status: Expired - Fee Related (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Other versions: CN113326393A (Chinese)
Inventors: 廖开阳, 陈星�, 曹从军, 章明珠, 王睿天, 罗晓洁
Assignees: Shenzhen Foresight Information Co., Ltd.; Xi'an Huaqi Zhongxin Technology Development Co., Ltd. (the listed assignees may be inaccurate; Google has not performed a legal analysis)
Application filed by Shenzhen Foresight Information Co., Ltd.

Classifications

    • G06F16/583: Retrieval of still image data using metadata automatically derived from the content
    • G06F16/51: Indexing; data structures therefor; storage structures for still image data
    • G06F16/55: Clustering; classification of still image data
    • G06F18/2415: Classification techniques based on parametric or probabilistic models
    • G06N3/048: Neural network activation functions
    • Y02D10/00: Energy efficient computing


Abstract

The invention discloses an image retrieval method based on deep hash features and heterogeneous parallel processing, implemented according to the following steps: step 1, train a deep hash network model; step 2, feed the test set and the query image into the trained network model to obtain their deep hash features, i.e., binary hash codes; step 3, compute the Hamming distances between the binary hash codes of the test set and of the query image obtained in step 2, and sort them in ascending order to obtain the initial ranking; step 4, select the binary hash codes of the first p images in the initial ranking, compute their Hamming distances to the query image's binary code again, and sort in ascending order to obtain the reranked result, i.e., the q retrieval results most similar to the query image. The method solves the problem of low image retrieval precision in the prior art.

Description

Image retrieval method based on deep hash feature and heterogeneous parallel processing
Technical Field
The invention belongs to the technical field of computer image retrieval, and relates to an image retrieval method based on deep hash features and heterogeneous parallel processing.
Background
With the rapid development of storage devices, computer networks, and multimedia technologies, the amount of image data people produce and encounter keeps growing. Quickly and accurately finding the image a user wants in a massive database has become a hot spot of current research, so image retrieval technology has attracted attention and developed rapidly. Such applications face two important challenges: (1) image features are usually high-dimensional, so storage requirements are high and computation is inefficient; (2) retrieval over large-scale data places high demands on speed and time.
In the prior art, two approaches to image retrieval dominate. One retrieves based on a global description of the image; because the feature dimension is high, storage, computation, and retrieval are all slow. The other retrieves based on local image features; although these describe local regions accurately, the description of the whole image is lost, so retrieval precision is low.
Therefore, how to provide an image retrieval method to improve the retrieval accuracy and speed is an urgent problem to be solved in the field of computer vision.
Disclosure of Invention
The invention aims to provide an image retrieval method based on deep hash features and heterogeneous parallel processing, solving the problem of low image retrieval precision in the prior art.
The technical scheme adopted by the invention is that the image retrieval method based on the deep hash feature and heterogeneous parallel processing is implemented according to the following steps:
step 1, off-line training network model
A GoogLeNet network model is adopted as the initialization network structure, and its last classification layer is replaced with a hash layer whose number of units equals the number of bits of the image code, yielding the GoogLeNet-1 network model. The image dataset CIFAR-10 is divided into a training set of 10 classes with 5,000 images each and a test set of 10 classes with 1,000 images each.
The training set is input into the GoogLeNet-1 network model; image depth features are extracted by the convolutional layers while the hash function is learned; the final depth features are mapped through the hash layer to obtain the corresponding binary hash codes; the loss function is then iteratively optimized and updated to obtain the optimal network parameters and the final GoogLeNet-hash model;
step 2, feed the test set and the query image into the trained GoogLeNet-hash network model to obtain their deep hash features, i.e., binary hash codes;
step 3, compute the Hamming distances between the binary hash codes of the test set and of the query image obtained in step 2, and sort them in ascending order to obtain the initial ranking;
step 4, select the binary hash codes of the first p images in the initial ranking, compute their Hamming distances to the query image's binary code again, and sort in ascending order to obtain the reranked result, i.e., the q retrieval results most similar to the query image (q < p).
The present invention is also characterized in that,
the process of generating the binary hash code in the hash layer in the step 1 and the step 2 specifically comprises the following steps:
after an m-dimensional image depth feature x is obtained from a full-connection layer of a GoogleLeNet-hash network model, the x is transmitted to a hash layer, q hash functions are provided on the assumption that the number of nodes of the hash layer is q, q bit hash codes are generated, and the hash codes generated by the q hash functions are shown in the following formula:
(h 1 ,h 2 ,...,h q ) T =(sigmoid(W 1 x),sigmoid(W 2 x)...,sigmoid(W q x)) T (1)
wherein h is 1 -h q For hash coding of bits 1 to q, sigmoid (W) 1 x)-sigmoid(W q x) is the 1 st to q th Hash codes relaxed by sigmoid function, W 1 -W q To construct q m-dimensional random vector matrices, W 1 -W q ∈R q *m ,W 1 -W q Is generated from a gaussian distribution;
quantizing the relaxed Hash code to obtain a final binary Hash code H, namely H = { H = 1 ,h 2 ,...,h q } T Thresholding is performed, and the final binary hash code is obtained by the following formula:
Figure BDA0003092503750000031
that is, the binary hash code H is a code consisting of 0 and 1.
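The relax-and-quantize procedure above can be sketched as follows (a minimal NumPy sketch; the dimensions m and q, the 0.5 threshold, and all variable names are illustrative assumptions, not the patent's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

m, q = 1024, 48                      # feature dimension m and code length q (example values)
W = rng.normal(size=(q, m))          # q m-dimensional random vectors from a Gaussian, W in R^{q x m}

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_hash(x, W):
    """Relax each hash bit with the sigmoid, then quantize to {0, 1}."""
    relaxed = sigmoid(W @ x)                  # (h_1, ..., h_q), each in (0, 1)
    return (relaxed >= 0.5).astype(np.uint8)  # thresholding yields the binary code H

x = rng.normal(size=m)               # stand-in for an m-dimensional depth feature
H = binary_hash(x, W)                # q-bit binary hash code of 0s and 1s
```

Because the sigmoid output lies in (0, 1), thresholding at 0.5 is the natural quantization choice, but the patent's exact threshold is hidden behind the garbled formula image.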
In step 1, the loss function is iteratively optimized and updated to obtain the optimal network parameters and the final deep hash network model GoogLeNet-hash, specifically:
Step 1.1, compute the probability that each image in the training set belongs to each category:
f(Z_k) = e^{Z_k} / Σ_{i=1}^{n} e^{Z_i}   (3)
where Z_k denotes the image features after hash-layer weighting, n denotes the number of image classes, f(Z_k) denotes the probability that the image belongs to each class, Z_i denotes the i-th class, 1 <= i <= n, and k is the image's true class;
Step 1.2, from f(Z_k) compute the value of the loss function Loss:
Loss = -log f(Z_k)   (4)
Step 1.3, solve for the optimal value of Loss, updating the weight coefficient θ by gradient descent:
∂Loss/∂Z_k = f(Z_k) - 1   (5)
∂Loss/∂θ = f(Z_k) - 1 + γθ   (6)
θ = θ - η(f(Z_k) - 1 + γθ)   (7)
where γ is the attenuation (weight-decay) factor and η is the learning rate; this completes the correction of the Softmax classifier and the updating of the network parameters, yielding the final deep hash network model GoogLeNet-hash.
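The softmax probability, loss, and update rule can be checked with a small numeric sketch (NumPy; the logit values, η, and γ are arbitrary example numbers, and the update is written for the scalar weight form given above rather than full backpropagation through the network):

```python
import numpy as np

def f(Z, k):
    """Softmax probability that the image belongs to class k."""
    e = np.exp(Z - Z.max())          # subtract the max for numerical stability
    return e[k] / e.sum()

def loss(Z, k):
    """Cross-entropy loss -log f(Z_k) for the true class k."""
    return -np.log(f(Z, k))

def update(theta, Z, k, eta=0.1, gamma=1e-4):
    """One gradient-descent step: theta <- theta - eta * (f(Z_k) - 1 + gamma * theta)."""
    return theta - eta * (f(Z, k) - 1.0 + gamma * theta)

Z = np.array([2.0, 1.0, 0.1])        # hash-layer-weighted features (logits), n = 3 classes
k = 0                                # true class of the image
p = f(Z, k)                          # probability of the true class
L = loss(Z, k)                       # positive while p < 1
theta = update(1.0, Z, k)            # updated weight coefficient
```

Since f(Z_k) < 1, the gradient f(Z_k) - 1 + γθ is negative for small γθ, so the update pushes the weight toward higher confidence on the true class.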
The feature extraction in step 2 inputs the images into the deep hash network GoogLeNet-hash to extract their binary hash features and thresholds them, finally obtaining a feature set. The specific steps are as follows:
Given the test set ψ = {I_1, I_2, ..., I_g}, where I_g denotes the g-th image in the test set, the test-set images are input into the deep hash network model GoogLeNet-hash to extract and threshold their hash features, obtaining the final feature set ψ_H = {H_1, H_2, ..., H_g}, where H_g ∈ {0, 1}^q.
Given a query image I_k, it is input into the deep hash network model GoogLeNet-hash to extract and threshold its hash features, obtaining the image's binary hash code H_k.
Here H_g and H_k are obtained as H = {h_1, h_2, ..., h_q}^T thresholded by the quantization formula above.
Step 3 is specifically as follows:
Compute the Hamming distance between the query image I_k's binary hash code H_k and each binary hash code H_g in the test-set code set ψ_H = {H_1, H_2, ..., H_g}, and sort in ascending order of distance to obtain the initial retrieval ranking.
When computing the Hamming distance, the binary hash codes H_k and H_g are compared bit by bit; for each position where the bits differ, the Hamming distance is increased by 1, giving the corresponding distance.
The central processing unit (CPU) obtains the query image I_k's binary hash code H_k and the test set's binary hash code set ψ_H = {H_1, H_2, ..., H_g}, and transfers H_k and ψ_H to the graphics processing unit (GPU), which computes the Hamming distances; after computation the results are sorted by Hamming distance from small to large to obtain the initial ranking, which is transferred back to the CPU.
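The distance computation and initial ranking of step 3 can be sketched as below (plain NumPy standing in for the GPU kernel; in the actual CPU+GPU scheme each GPU thread would handle one test-set code, and the array contents here are made-up examples):

```python
import numpy as np

def hamming_distances(H_k, psi_H):
    """Count differing bits between the query code H_k and every row of psi_H."""
    return np.count_nonzero(psi_H != H_k, axis=1)

def initial_ranking(H_k, psi_H):
    """Sort test-set images by ascending Hamming distance to the query."""
    d = hamming_distances(H_k, psi_H)
    order = np.argsort(d, kind="stable")
    return order, d[order]

H_k = np.array([1, 0, 0, 0, 1, 0, 0, 1], dtype=np.uint8)   # query code 10001001
psi_H = np.array([
    [1, 0, 0, 0, 1, 0, 0, 1],   # identical code, distance 0
    [1, 0, 1, 1, 0, 0, 0, 1],   # 10110001, differs in 3 bits
    [0, 1, 1, 1, 0, 1, 1, 0],   # complement, distance 8
], dtype=np.uint8)

order, dists = initial_ranking(H_k, psi_H)   # order = [0, 1, 2], dists = [0, 3, 8]
```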
Step 4 is specifically as follows: the CPU computes the Hamming distances between the binary hash codes of these images and of the query image again to obtain the reranked result, i.e., the q images (q < p) most similar to the query image, giving the final retrieval result.
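A sketch of the step-4 reranking (NumPy; p, q, and all codes are toy values for illustration only):

```python
import numpy as np

def rerank(H_k, psi_H, initial_order, p, q):
    """Recompute Hamming distances for the top-p initial results and keep the q best."""
    assert q < p
    top_p = initial_order[:p]                          # indices of the first p images
    d = np.count_nonzero(psi_H[top_p] != H_k, axis=1)  # second Hamming-distance pass
    return top_p[np.argsort(d, kind="stable")][:q]     # q most similar images

H_k = np.array([1, 1, 0, 0], dtype=np.uint8)
psi_H = np.array([[1, 1, 0, 0],    # distance 0
                  [1, 1, 0, 1],    # distance 1
                  [1, 0, 0, 1],    # distance 2
                  [0, 0, 1, 1],    # distance 4
                  [1, 1, 1, 0]],   # distance 1
                 dtype=np.uint8)
initial_order = np.array([0, 1, 4, 2, 3])  # ascending-distance order from step 3
result = rerank(H_k, psi_H, initial_order, p=3, q=2)
```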
The CIFAR-10 dataset contains 60,000 images in total: the training set has 10 classes of 5,000 images each, and the test set has 10 classes of 1,000 images each.
The invention has the beneficial effects that:
the invention combines a deep learning network and a Hash algorithm to form an end-to-end deep Hash network model, then extracts binary Hash codes of CIFAR-10 images as feature indexes, accelerates the retrieval speed by introducing GPU parallel retrieval to carry out feature matching and distance measurement, and finally improves the precision of the final retrieval result by utilizing result rearrangement.
Drawings
FIG. 1 is a flow chart of an image retrieval method based on deep hash feature and heterogeneous parallel processing according to the present invention;
fig. 2 is a schematic diagram of a CPU + GPU heterogeneous parallel processing structure in the image retrieval method based on the deep hash feature and heterogeneous parallel processing of the present invention.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
The invention discloses an image retrieval method based on deep hash features and heterogeneous parallel processing; its flow is shown in FIG. 1, and it is specifically implemented according to the following steps:
step 1, off-line training network model
A GoogLeNet network model is adopted as the initialization network structure, and its last classification layer is replaced with a hash layer whose number of units equals the number of bits of the image code, yielding the GoogLeNet-1 network model. The image dataset CIFAR-10 (60,000 images in total) is divided into a training set of 10 classes with 5,000 images each and a test set of 10 classes with 1,000 images each. The training set is input into the GoogLeNet-1 network model; image depth features are extracted by the convolutional layers while the hash function is learned; the final depth features are mapped through the hash layer to obtain the corresponding binary hash codes; the loss function is then iteratively optimized and updated to obtain the optimal network parameters and the final GoogLeNet-hash model;
step 2, feed the test set and the query image into the trained GoogLeNet-hash network model to obtain their deep hash features, i.e., binary hash codes;
In the method, a hash layer is designed, and the parameter values of the hash function are learned from the training data to generate more compact hash features. After the image depth features are obtained from the fully connected layer of the GoogLeNet-hash network model, they are passed into the hash layer to generate the binary hash codes;
the process of generating the binary hash code in the hash layer in the step 1 and the step 2 specifically comprises the following steps:
supposing that m-dimensional image depth features x are obtained from a full connection layer of a GoogLeNet-hash network model, then transmitting x into a hash layer, supposing that the number of nodes of the hash layer is q, namely q hash functions exist, generating q-bit hash codes, wherein the hash codes generated by the q hash functions are shown in the following formula:
(h 1 ,h 2 ,...,h q ) T =(sgn(W 1 x),sgn(W 2 x)...,sgn(W q x)) T (1)
since the sgn function is not a convex function and the objective function cannot be optimized and solved by using a gradient-based method, the sigmoid function is selected for relaxation, the coding range is constrained to the (0, 1) interval, and the final Hash codes generated by q Hash functions are obtained as shown in the following formula:
(h 1 ,h 2 ,...,h q ) T =(sigmoid(W 1 x),sigmoid(W 2 x)...,sigmoid(W q x)) T (2)
wherein h is 1 -h q For hash coding of bits 1 to q, sigmoid (W) 1 x)-sigmoid(W q x) is the 1 st to q th Hash codes relaxed by sigmoid function, W 1 -W q To construct q m-dimensional random vector matrices, W 1 -W q ∈R q *m ,W 1 -W q Is generated from a gaussian distribution;
quantizing the relaxed Hash code to obtain a final binary Hash code H, namely H = { H = 1 ,h 2 ,...,h q } T Thresholding is performed, and the final binary hash code is obtained by the following formula:
Figure BDA0003092503750000071
that is, the binary hash code H is a code consisting of 0 and 1;
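The motivation for the sigmoid relaxation, namely that sgn provides no usable gradient while sigmoid does, can be verified numerically (a small illustrative check, not part of the patent):

```python
import numpy as np

def num_grad(f, z, eps=1e-6):
    """Central-difference estimate of df/dz."""
    return (f(z + eps) - f(z - eps)) / (2.0 * eps)

sgn = np.sign
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

z = 0.7
g_sgn = num_grad(sgn, z)        # 0: sgn is flat almost everywhere, so no training signal
g_sig = num_grad(sigmoid, z)    # positive, matching sigmoid(z) * (1 - sigmoid(z))
```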
the method comprises the following steps of obtaining optimal network parameters and a final GoogLeNet-hash through iterative optimization and updating of a loss function, wherein the step of obtaining the optimal network parameters and the final GoogLeNet-hash is specifically as follows:
step 1.1, calculating the probability of each image in the training set belonging to each category;
Figure BDA0003092503750000072
wherein Z is k Representing the image features after weighting by the hash layer, n representing the number of image classes, f (Z) k ) Representing the probability of an image belonging to each class, Z i Representing the ith class, where 1 < = i < = n, k is the class of the image true;
step 1.2, according to f (Z) k ) Calculating the value of the Loss function Loss:
Loss=-logf(Z k ) (5)
step 1.3, solving the optimal value of Loss, and updating a weight coefficient theta by adopting a gradient descent method:
Figure BDA0003092503750000073
Figure BDA0003092503750000074
θ=θ-η(f(Z k )-1+γθ) (8)
wherein gamma is an attenuation factor, eta is a learning rate, so that correction of the Softmax classifier and updating of network parameters are completed, and a final deep hash network model GoogLeNet-hash is obtained;
the hash layer also belongs to a hidden layer of a neural network, the number of neurons of the hidden layer is not specifically determined, and the number of nodes of the hash layer designed in the invention determines the length of the binary coding features of the image, so that the number of nodes of the hash layer can be finally determined by comparing the training speed of different node numbers with the precision of the binary coding during retrieval through experiments.
The feature extraction in step 2 inputs the images into the deep hash network GoogLeNet-hash to extract their binary hash features and thresholds them, finally obtaining a feature set. The specific steps are as follows:
Given the test set ψ = {I_1, I_2, ..., I_g}, where I_g denotes the g-th image in the test set, the test-set images are input into the deep hash network model GoogLeNet-hash to extract and threshold their hash features, obtaining the final feature set ψ_H = {H_1, H_2, ..., H_g}, where H_g ∈ {0, 1}^q.
Given a query image I_k, it is input into the deep hash network model GoogLeNet-hash to extract and threshold its hash features, obtaining the image's binary hash code H_k.
Here H_g and H_k are obtained as H = {h_1, h_2, ..., h_q}^T thresholded according to formula (3).
Step 3, compute the Hamming distances between the binary hash codes of the test set and of the query image obtained in step 2, and sort them in ascending order to obtain the initial ranking; specifically:
Compute the Hamming distance between the query image I_k's binary hash code H_k and each binary hash code H_g in the test-set code set ψ_H = {H_1, H_2, ..., H_g}, and sort in ascending order of distance to obtain the initial retrieval ranking.
When computing the Hamming distance, the binary hash codes H_k and H_g are compared bit by bit; for each position where the bits differ, the Hamming distance is increased by 1. For example, 10001001 and 10110001 differ in 3 bits, so their Hamming distance is 3. The larger the Hamming distance, the greater the difference between the query image and the test-set image, i.e., the lower the similarity; sorting by Hamming distance from small to large therefore ranks similar images first.
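In practice the q-bit codes can be packed into integers so that each comparison is a single XOR followed by a popcount; a standard-library sketch of the example above (the packing scheme is an implementation choice, not specified by the patent):

```python
def hamming(a: int, b: int) -> int:
    """Hamming distance between two bit-packed hash codes: XOR, then count the 1 bits."""
    return bin(a ^ b).count("1")

# The example from the text: 10001001 and 10110001 differ in 3 bits.
d = hamming(0b10001001, 0b10110001)   # d == 3
```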
As shown in FIG. 2, the central processing unit (CPU) obtains the query image I_k's binary hash code H_k and the test set's binary hash code set ψ_H = {H_1, H_2, ..., H_g}, and transfers them to the graphics processing unit (GPU), which computes the Hamming distances; the computed distances are sorted from small to large to obtain the initial ranking, which is transferred back to the CPU;
Step 4, select the binary hash codes of the first p images in the initial ranking; the CPU computes their Hamming distances to the query image's binary hash code again, and sorts in ascending order to obtain the reranked result, i.e., the q retrieval results most similar to the query image (q < p).
The invention uses a deep neural network to extract image features, and the network structure has an important influence on training: a network that is too complex is hard to train and prone to overfitting, while one that is too simple cannot exercise the network's learning capacity. The GoogLeNet network is chosen; it increases the number of network layers, adds losses at different depths to avoid the vanishing-gradient problem, and concatenates convolution kernels of different sizes to fuse features at different scales.
As shown in FIG. 1, large-scale image retrieval based on deep hash features and heterogeneous parallel processing can be divided into four parts: network model training, image feature extraction, parallel processing and computation, and retrieval-result reranking. The network model training part replaces the last fully connected layer of GoogLeNet with a hash layer to form the GoogLeNet-1 network model, then obtains the final deep hash network model GoogLeNet-hash through hash learning and parameter optimization. The feature extraction part uses the pre-trained network model to extract depth features of the test-set images and the query image. The parallel processing part exploits the GPU's strong data-processing capacity, assigning threads to compute the Hamming distances between the binary hash codes of the query image and of the test-set images, and ranks by similarity according to distance, the smaller the distance, the more similar. The reranking part is a method for improving retrieval precision: by computing Hamming distances twice, it obtains the final reranked result, i.e., the q images most similar to the query.
In terms of function execution, the large-scale image retrieval method based on deep hash features and heterogeneous parallel processing first obtains the deep hash network model GoogLeNet-hash from the training set; second, it extracts binary hash-code features of the images with the pre-trained deep hash network model; then it extracts and matches features of the query image, executing CPU+GPU heterogeneous parallel processing in which threads compute the Hamming distances between the binary hash codes of the query image and of the test-set images, producing an initial ranking based on Hamming distance; finally, it reranks the results, improving retrieval precision through a second Hamming-distance computation to obtain the q images most similar to the query image. The method fully exploits the depth features of the image and the simplicity of binary hash codes, and combines the GPU's strong data-processing capacity to realize fast and accurate large-scale image retrieval.

Claims (5)

1. An image retrieval method based on deep hash characteristics and heterogeneous parallel processing is characterized by comprising the following steps:
step 1, off-line training network model
A GoogLeNet network model is adopted as the initialization network structure, and its last classification layer is replaced with a hash layer whose number of units equals the number of bits of the image code, yielding the GoogLeNet-1 network model; the image dataset CIFAR-10 is divided into a training set and a test set, each containing several classes of images; the training set is input into the GoogLeNet-1 network model, image depth features are extracted by the convolutional layers while the hash function is learned, the final depth features are mapped through the hash layer to obtain the corresponding binary hash codes, and the loss function is then iteratively optimized and updated to obtain the optimal network parameters and the final deep hash network model GoogLeNet-hash;
step 2, send the test set and the query image into the trained GoogLeNet-hash network model to obtain their deep hash features, i.e., binary hash codes; the feature extraction in step 2 inputs the images into the deep hash network GoogLeNet-hash to extract their binary hash features and thresholds them, finally obtaining a feature set, specifically:
given the test set ψ = {I_1, I_2, ..., I_g}, where I_g denotes the g-th image in the test set, the test-set images are input into the deep hash network model GoogLeNet-hash to extract and threshold their hash features, obtaining the final feature set ψ_H = {H_1, H_2, ..., H_g}, where H_g ∈ {0, 1}^q;
given a query image I_k, it is input into the deep hash network model GoogLeNet-hash to extract and threshold its hash features, obtaining the image's binary hash code H_k;
where H_g and H_k are obtained as H = {h_1, h_2, ..., h_q}^T thresholded according to formula (3);
step 3, compute the Hamming distances between the binary hash codes of the test set and of the query image obtained in step 2, and sort them in ascending order to obtain the initial ranking; specifically: compute the Hamming distance between the query image I_k's binary hash code H_k and each binary hash code H_g in the test-set code set ψ_H = {H_1, H_2, ..., H_g}, and sort in ascending order of distance to obtain the initial retrieval ranking;
the central processing unit (CPU) obtains the query image I_k's binary hash code H_k and the test set's binary hash code set ψ_H = {H_1, H_2, ..., H_g}, and transfers them to the graphics processing unit (GPU), which computes the Hamming distances; the computed distances are sorted from small to large to obtain the initial ranking, which is transferred back to the CPU;
step 4, select the binary hash codes of the first p images in the initial ranking, compute their Hamming distances to the query image's binary code again, and sort in ascending order to obtain the reranked result, i.e., the q retrieval results most similar to the query image, where q < p;
the process of generating the binary hash code in the hash layer in step 1 and step 2 is specifically:
after an m-dimensional image depth feature x is obtained from the fully connected layer of the GoogLeNet-hash network model, x is passed to the hash layer; assuming the hash layer has q nodes, there are q hash functions generating a q-bit hash code, and the hash codes generated by the q hash functions are given by the following formula:
(h_1, h_2, ..., h_q)^T = (sigmoid(W_1 x), sigmoid(W_2 x), ..., sigmoid(W_q x))^T  (2)
wherein h_1 to h_q are the 1st to q-th bits of the hash code, sigmoid(W_1 x) to sigmoid(W_q x) are the 1st to q-th hash codes relaxed by the sigmoid function, and W_1 to W_q are q constructed m-dimensional random vector matrices, W_1 to W_q ∈ R^{q×m}, generated from a Gaussian distribution;
quantizing the relaxed hash code yields the final binary hash code H, i.e. H = {h_1, h_2, ..., h_q}^T is thresholded, and the final binary hash code is obtained by the following formula:
h_i = 1 if h_i ≥ 0.5, h_i = 0 otherwise, for i = 1, ..., q  (3)
that is, the binary hash code H is a code consisting of 0s and 1s.
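A small numerical sketch of formulas (2) and (3): q Gaussian random projections applied to an m-dimensional feature x, relaxed by the sigmoid function, then thresholded to {0,1}. The 0.5 threshold, the dimensions, and all names below are illustrative assumptions, not the patent's implementation:

```python
import numpy as np

def hash_layer(x, W):
    """Formula (2): relaxed codes h_i = sigmoid(W_i . x), then
    formula (3): threshold each relaxed code to a binary bit."""
    relaxed = 1.0 / (1.0 + np.exp(-(W @ x)))   # sigmoid, shape (q,)
    return (relaxed >= 0.5).astype(np.uint8)   # assumed threshold at 0.5

rng = np.random.default_rng(0)
m, q = 8, 4                      # feature dimension m, hash length q (toy sizes)
W = rng.standard_normal((q, m))  # q Gaussian random projection rows
x = rng.standard_normal(m)       # stands in for an m-dimensional deep feature
H = hash_layer(x, W)
print(H)                         # length-q vector of 0s and 1s
```

Because sigmoid maps R into (0, 1) monotonically, thresholding its output at 0.5 is equivalent to taking the sign of W_i x, which is why the relaxation does not change which side of the hyperplane each feature falls on.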
2. The image retrieval method based on the deep hash feature and heterogeneous parallel processing as claimed in claim 1, wherein in step 1 the loss function is further iteratively optimized and updated to obtain the optimal network parameters and the final deep hash network model GoogLeNet-hash, specifically:
step 1.1, calculating the probability of each image in the training set belonging to each category;
f(Z_k) = e^{Z_k} / Σ_{i=1}^{n} e^{Z_i}  (4)
wherein Z_k represents the image features after weighting by the hash layer, n represents the number of image classes, f(Z_k) represents the probability of the image belonging to each class, Z_i represents the i-th class, where 1 <= i <= n, and k is the ground-truth class of the image;
step 1.2, calculating the value of the loss function Loss according to f(Z_k):
Loss = -log f(Z_k)  (5)
step 1.3, solving for the optimal value of Loss, and updating the weight coefficient θ by gradient descent:
∂Loss/∂Z_k = f(Z_k) - 1  (6)
∂Loss/∂θ = f(Z_k) - 1 + γθ  (7)
θ = θ - η(f(Z_k) - 1 + γθ)  (8)
wherein γ is an attenuation factor and η is the learning rate; this completes the correction of the Softmax classifier and the updating of the network parameters, yielding the final deep hash network model GoogLeNet-hash.
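To illustrate formulas (4) through (8) numerically: the softmax probability of the ground-truth class, the cross-entropy loss, and one weight-decayed gradient step. This is a scalar sketch with hypothetical logits and hyperparameter values, not the patent's training code:

```python
import numpy as np

def softmax_loss_and_grad(Z, k):
    """Formulas (4)-(6): f(Z_k) = exp(Z_k) / sum_i exp(Z_i),
    Loss = -log f(Z_k), and dLoss/dZ_k = f(Z_k) - 1."""
    f = np.exp(Z - Z.max())        # shift by max(Z) for numerical stability
    f = f / f.sum()
    loss = -np.log(f[k])
    grad_k = f[k] - 1.0            # formula (6)
    return f[k], loss, grad_k

Z = np.array([2.0, 1.0, 0.1])      # logits after the hash layer (illustrative)
k = 0                              # ground-truth class index
fk, loss, grad_k = softmax_loss_and_grad(Z, k)

# Formula (8): theta = theta - eta * (f(Z_k) - 1 + gamma * theta),
# i.e. gradient descent with weight decay gamma and learning rate eta.
theta, eta, gamma = 0.5, 0.1, 0.01  # hypothetical weight and hyperparameters
theta = theta - eta * (grad_k + gamma * theta)
```

Since f(Z_k) < 1 whenever the classifier is not perfectly confident, grad_k is negative and the update increases θ toward a better score for the true class, while the γθ term shrinks the weights slightly at every step.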
3. The image retrieval method based on the deep hash feature and heterogeneous parallel processing as claimed in claim 1, wherein when computing the Hamming distance, the binary hash code H_k and the binary hash code H_n are compared bit by bit: for each bit position at which the two hash codes differ, 1 is added to the Hamming distance, yielding the corresponding Hamming distance.
4. The image retrieval method based on the deep hash feature and heterogeneous parallel processing according to claim 1, wherein step 4 specifically comprises: the CPU recomputes the Hamming distances between these images and the binary hash code of the query image to obtain a re-ranking result, i.e. the q images (q < p) most similar to the query image, giving the final retrieval result.
5. The image retrieval method based on the deep hash feature and heterogeneous parallel processing as claimed in claim 1, wherein the CIFAR-10 data set contains 60000 images in total; the training set is divided into 10 categories with 5000 images each, and the test set is divided into 10 categories with 1000 images each.
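The bit-by-bit comparison of claim 3 above is equivalent to XOR-ing the two codes and counting the set bits (popcount), which is how packed hash codes are typically compared in practice. A minimal sketch of both views (the function names are illustrative):

```python
def hamming_distance(hk, hn):
    """Claim 3, literally: compare two binary hash codes bit by bit;
    each position where the bits differ adds 1 to the distance."""
    assert len(hk) == len(hn)
    return sum(a != b for a, b in zip(hk, hn))

def hamming_distance_packed(hk_int, hn_int):
    """Equivalent for codes packed into integers: XOR marks the
    differing bits, and counting the 1s gives the distance."""
    return bin(hk_int ^ hn_int).count("1")

print(hamming_distance([0, 1, 1, 0], [1, 1, 0, 0]))  # 2
print(hamming_distance_packed(0b0110, 0b1100))       # 2
```

The packed form is why Hamming ranking is fast on both CPU and GPU: one XOR plus one popcount instruction per machine word, instead of q separate comparisons.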
CN202110600390.6A 2021-05-31 2021-05-31 Image retrieval method based on deep hash feature and heterogeneous parallel processing Expired - Fee Related CN113326393B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110600390.6A CN113326393B (en) 2021-05-31 2021-05-31 Image retrieval method based on deep hash feature and heterogeneous parallel processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110600390.6A CN113326393B (en) 2021-05-31 2021-05-31 Image retrieval method based on deep hash feature and heterogeneous parallel processing

Publications (2)

Publication Number Publication Date
CN113326393A CN113326393A (en) 2021-08-31
CN113326393B true CN113326393B (en) 2023-04-07

Family

ID=77422601

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110600390.6A Expired - Fee Related CN113326393B (en) 2021-05-31 2021-05-31 Image retrieval method based on deep hash feature and heterogeneous parallel processing

Country Status (1)

Country Link
CN (1) CN113326393B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106407352A (en) * 2016-09-06 2017-02-15 广东顺德中山大学卡内基梅隆大学国际联合研究院 Traffic image retrieval method based on depth learning
CN109918532A (en) * 2019-03-08 2019-06-21 苏州大学 Image search method, device, equipment and computer readable storage medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105512273A (en) * 2015-12-03 2016-04-20 中山大学 Image retrieval method based on variable-length depth hash learning
CN106503106B (en) * 2016-10-17 2019-10-18 北京工业大学 A kind of image hash index construction method based on deep learning
CN107016708B (en) * 2017-03-24 2020-06-05 杭州电子科技大学 Image hash coding method based on deep learning
CN107423376B (en) * 2017-07-10 2019-12-27 上海媒智科技有限公司 Supervised deep hash rapid picture retrieval method and system
CN108920720B (en) * 2018-07-30 2021-09-07 电子科技大学 Large-scale image retrieval method based on depth hash and GPU acceleration
CN109241313B (en) * 2018-08-14 2021-11-02 大连大学 Image retrieval method based on high-order deep hash learning
US11556581B2 (en) * 2018-09-04 2023-01-17 Inception Institute of Artificial Intelligence, Ltd. Sketch-based image retrieval techniques using generative domain migration hashing
CN109241317B (en) * 2018-09-13 2022-01-11 北京工商大学 Pedestrian Hash retrieval method based on measurement loss in deep learning network

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106407352A (en) * 2016-09-06 2017-02-15 广东顺德中山大学卡内基梅隆大学国际联合研究院 Traffic image retrieval method based on depth learning
CN109918532A (en) * 2019-03-08 2019-06-21 苏州大学 Image search method, device, equipment and computer readable storage medium

Also Published As

Publication number Publication date
CN113326393A (en) 2021-08-31

Similar Documents

Publication Publication Date Title
WO2020182019A1 (en) Image search method, apparatus, device, and computer-readable storage medium
Zhang et al. Improved deep hashing with soft pairwise similarity for multi-label image retrieval
CN110298037B (en) Convolutional neural network matching text recognition method based on enhanced attention mechanism
CN111753189B (en) Few-sample cross-modal hash retrieval common characterization learning method
CN109783682B (en) Point-to-point similarity-based depth non-relaxed Hash image retrieval method
CN113177132B (en) Image retrieval method based on depth cross-modal hash of joint semantic matrix
CN108038122B (en) Trademark image retrieval method
CN109815801A (en) Face identification method and device based on deep learning
CN110222218B (en) Image retrieval method based on multi-scale NetVLAD and depth hash
CN109766469B (en) Image retrieval method based on deep hash learning optimization
CN111125411B (en) Large-scale image retrieval method for deep strong correlation hash learning
CN104199923B (en) Large-scale image library searching method based on optimal K averages hash algorithm
CN104112018B (en) A kind of large-scale image search method
CN112732864B (en) Document retrieval method based on dense pseudo query vector representation
CN108304573A (en) Target retrieval method based on convolutional neural networks and supervision core Hash
CN114358188A (en) Feature extraction model processing method, feature extraction model processing device, sample retrieval method, sample retrieval device and computer equipment
CN111026887B (en) Cross-media retrieval method and system
CN114118369B (en) Image classification convolutional neural network design method based on group intelligent optimization
CN111008224A (en) Time sequence classification and retrieval method based on deep multitask representation learning
CN113806580B (en) Cross-modal hash retrieval method based on hierarchical semantic structure
CN112860930A (en) Text-to-commodity image retrieval method based on hierarchical similarity learning
CN112163114B (en) Image retrieval method based on feature fusion
CN113836896A (en) Patent text abstract generation method and device based on deep learning
CN114926742B (en) Loop detection and optimization method based on second-order attention mechanism
CN115795065A (en) Multimedia data cross-modal retrieval method and system based on weighted hash code

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20230303

Address after: 518000 301, Feiyada Science and Technology Building, No. 002, Gaoxin South 1st Road, High-tech Zone Community, Yuehai Street, Nanshan District, Shenzhen, Guangdong Province

Applicant after: Shenzhen foresight Information Co.,Ltd.

Address before: 710000 No. B49, Xinda Zhongchuang space, 26th Street, block C, No. 2 Trading Plaza, South China City, international port district, Xi'an, Shaanxi Province

Applicant before: Xi'an Huaqi Zhongxin Technology Development Co.,Ltd.

Effective date of registration: 20230303

Address after: 710000 No. B49, Xinda Zhongchuang space, 26th Street, block C, No. 2 Trading Plaza, South China City, international port district, Xi'an, Shaanxi Province

Applicant after: Xi'an Huaqi Zhongxin Technology Development Co.,Ltd.

Address before: 710048 Shaanxi province Xi'an Beilin District Jinhua Road No. 5

Applicant before: XI'AN University OF TECHNOLOGY

GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20230407