CN113191445B - Large-scale image retrieval method based on self-supervision countermeasure Hash algorithm - Google Patents
- Publication number
- CN113191445B CN113191445B CN202110531130.8A CN202110531130A CN113191445B CN 113191445 B CN113191445 B CN 113191445B CN 202110531130 A CN202110531130 A CN 202110531130A CN 113191445 B CN113191445 B CN 113191445B
- Authority
- CN
- China
- Prior art keywords
- image
- hash
- generator
- hash code
- encoder
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24147—Distances to closest patterns, e.g. nearest neighbour classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Abstract
The invention provides a large-scale image retrieval method based on a self-supervised adversarial hashing algorithm. The invention proposes a new hash-learning framework, called self-supervised adversarial hashing. The framework learns discriminative hash codes primarily through a self-supervised similarity metric based on image rotation and a generative adversarial network. The neural network model comprises an encoder for obtaining hash codes, a generator for producing pseudo images, and a discriminator for distinguishing real images from generated ones. A loss function composed of an approximate semantic-similarity loss, a feature loss, and an adversarial loss is designed to preserve the similarity between images and their hash codes. Self-supervision is built into the whole model so that low-level semantic information is ignored while high-level semantic information is retained; for short hash codes in particular, the high-level semantic information of the image is preserved better. Experimental results show that, compared with existing retrieval methods, the proposed image retrieval method achieves better retrieval performance.
Description
Technical Field
The invention belongs to the technical field of deep learning, and particularly relates to a large-scale image retrieval method based on a self-supervised adversarial hashing algorithm.
Background
Hashing algorithms have attracted increasing attention for large-scale image retrieval owing to their low storage requirements and high search efficiency. They can be divided into supervised and unsupervised hashing according to whether image labels are used, and supervised methods generally outperform unsupervised ones. In most cases, however, the dataset contains no useful label information for the images, and manual labeling requires considerable manpower. Many researchers have attempted to address this problem. For example, Gidaris et al. proposed a self-supervision method based on image rotation; however, it can yield different feature representations for an image before and after rotation. Misra et al. addressed this issue, but did not map the similarity structure of similar images in the original space into the feature space.
With the rise of deep learning, deep learning algorithms can be divided into two categories: supervised and unsupervised learning. Supervised learning algorithms are favored for their high accuracy, but manually annotated labels are not readily available and demand significant human resources. In recent years, therefore, unsupervised learning has received increasing attention, and self-supervised learning has become a popular choice within it. Once mainstream supervised learning tasks mature, data becomes the most important bottleneck; learning effective information from unlabeled data is thus an important research topic, and self-supervised learning offers rich possibilities.
Disclosure of Invention
The invention aims to provide a large-scale image retrieval method based on a self-supervised adversarial hashing algorithm to overcome the shortcomings of the prior art.
To achieve this aim, the invention adopts the following technical scheme:
A large-scale image retrieval method based on a self-supervised adversarial hashing algorithm comprises the following steps:
s1: acquiring image data comprising a training set and a test set;
s2: optimizing the encoder by utilizing the training set;
s3: rotating the test-set image data and inputting it into the encoder optimized in S2 to obtain hash codes;
s4: calculating the Hamming distances between the hash codes obtained in S3 and the training-set hash codes from S2, sorting them in ascending order, and outputting the first k retrieval results to complete the retrieval.
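The retrieval step S4 can be sketched as follows. This is a minimal illustration, assuming the binary codes are stored as {0, 1} NumPy arrays; the function and variable names are illustrative, not from the patent.

```python
import numpy as np

def hamming_topk(query_code, db_codes, k=10):
    """Return indices of the k database codes closest to the query in Hamming distance.

    Codes are assumed to be {0, 1} arrays; names here are illustrative.
    """
    dists = np.count_nonzero(db_codes != query_code, axis=1)  # Hamming distance to each database code
    return np.argsort(dists, kind="stable")[:k]               # ascending order, first k results

# toy example: four 4-bit database codes and one query
db = np.array([[0, 0, 0, 0],
               [1, 1, 1, 1],
               [0, 0, 1, 1],
               [0, 1, 0, 0]])
q = np.array([0, 0, 0, 1])
top2 = hamming_topk(q, db, k=2)  # indices 0 and 2 are at Hamming distance 1
```

Because the codes are short binary vectors, the whole database scan is a cheap bitwise comparison, which is the storage and speed advantage the patent attributes to hashing.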
Further, in S2: the encoder uses a VGG19-like structure comprising five convolutional layers, two fully connected layers, and one hash layer; for feature comparison, a fully connected layer is added at the end. Using the relationship between image neighborhood structures, i.e., the relationship between the hash codes B and the semantic similarity matrix, the following objective function l_s is proposed to learn hash codes that approximate the original data distribution in the projection space as closely as possible:
where L is the length of the hash code, S is the similarity matrix, and E indicates that the objective function optimizes the encoder, which generates the hash codes. Optimizing l_s makes similar images in the original space map to similar hash codes in the hash space.
Further, the encoder optimization in S2 specifically includes:
s2-1: obtaining the feature vectors of the images of the training set, calculating the cosine distances between the images and sequencing to obtain a similarity ranking;
s2-2: analyzing the similarity ranking and setting a threshold value to obtain a similarity matrix;
s2-3: rotating the training set image and inputting the image into the encoder to obtain a hash code;
s2-4: inputting the hash code into a generator to obtain a pseudo image;
s2-5: inputting the pseudo image and the real image into a discriminator simultaneously for adversarial training;
s2-6: optimizing the encoder, the generator, and the discriminator according to an objective function; the optimized encoder, generator, and discriminator form the self-supervised adversarial hashing algorithm.
Further, S2-1 specifically comprises: for each database point, extracting a feature vector from the pool5 layer of a VGG model, calculating the cosine distances between the feature vectors, and sorting them in ascending order to obtain the similarity ranking (a k-nearest-neighbor, KNN, approach).
Further, in S2-2: for each image, the K1 nearest neighbors under cosine similarity are taken as its neighborhood, yielding the initial matrix S_1, which is calculated as follows:
where x_i and x_j are image feature vectors and K1-NN denotes the K1 nearest neighbors of x_i. On the basis of S_1, its corresponding rows and columns are compared and used to construct S_2, as follows:
where K2-NN denotes the K2 nearest neighbors of x_i. Finally, the two matrices are combined into S, calculated as follows:
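The construction of the initial neighborhood matrix can be sketched as below. The patent's formula images for S_1, S_2, and S are not reproduced in the text, so this sketch assumes a common convention in unsupervised hashing: S_1[i, j] = 1 if x_j is among the K1 nearest neighbors of x_i under cosine distance, and -1 otherwise. The second-stage refinement via K2-NN is omitted here.

```python
import numpy as np

def knn_similarity(features, k1=20):
    """Sketch of the initial similarity matrix S1 (assumed form: +1 for the K1
    nearest neighbours under cosine distance, -1 otherwise; the patent's exact
    formula is not shown in the text)."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)  # L2-normalise rows
    cos_dist = 1.0 - f @ f.T                                        # cosine distance
    order = np.argsort(cos_dist, axis=1)                            # ascending: nearest first
    s1 = -np.ones_like(cos_dist)
    rows = np.arange(len(features))[:, None]
    s1[rows, order[:, :k1]] = 1.0                                   # mark the K1 nearest neighbours
    return s1

# toy 2-D features: points 0/1 are close, points 2/3 are close
feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
S1 = knn_similarity(feats, k1=2)
```

With K1 = 2 each point's neighborhood contains itself and its one nearest neighbor, so the toy matrix splits the four points into the two expected pairs.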
further, in the S2-4: the generator consists of a full-link layer and four deconvolution layers, and is generatedIn the device, a hash code is input as 'noise' to generate a new image; specifically, the hash code is input into a fully concatenated layer of size 8 × 8 × 256, and then 3 deconvolution layers of 5 × 5 and 1 × 1 are used, the number of kernels being 256, 128, 32, and 3, respectively; for the image I generated by the generatorGAnd an original image I, and an objective function l is provided between the feature vectorsf1(ii) a The objective function is defined as follows:
where Ψ (-) denotes the convolution-activated feature vector, w and h denote the sizes of the corresponding features, and D denotes the adjustment of the parameters in the arbiter; the generator generates a new image by using the hash code of the rotated image, so that a larger difference exists between the feature vector of the new image and the original image in consideration of low-level semantic information in the image; based on this problem, the image I after rotationRAn objective function is arranged between the feature vectors of the original image I and the feature vectors of the original image I, the objective function being to ensure that the feature vectors of the same image are as similar as possible irrespective of rotation, thereby reducing the new image IRAnd a rotated image I obtained from the encoderGThe feature vector of the original image is obtained from the discriminator. Therefore, we use this loss function to optimize the encoder and the discriminator; so the finally set objective function lf2The following were used:
wherein, IRThe image is rotated, and I is an original image; the method aims to ignore semantic information of a lower layer, so that the loss of characteristics of the lower layer is not considered. Wherein lf=lf1+γlf2And gamma is a weight parameter.
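The feature losses described above can be sketched numerically as follows. The patent's equation images are not reproduced in the text, so the squared-L2 form normalized by the feature size w·h is an assumption consistent with the description of l_f1 and l_f2; all names are illustrative.

```python
import numpy as np

def feature_loss(feat_a, feat_b):
    """Assumed form of l_f1 / l_f2: squared L2 distance between two discriminator
    feature maps, normalised by the feature size w*h."""
    w, h = feat_a.shape[-2:]
    return float(np.sum((feat_a - feat_b) ** 2) / (w * h))

def combined_feature_loss(feat_IG, feat_I, feat_IR, gamma=3.0):
    # l_f = l_f1 + gamma * l_f2: generated image vs. original, plus rotated vs. original
    return feature_loss(feat_IG, feat_I) + gamma * feature_loss(feat_IR, feat_I)
```

The default γ = 3 follows the value the patent reports using in its experiments.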
Further, in S2-5: following the structure of adversarial learning, a discriminator D is set to judge whether an image is real or generated, thereby optimizing the generator G; training a generative adversarial network is a minimax game. The discriminator D consists of four convolutional layers with 32, 128, 256, and 256 kernels, followed by a fully connected layer of size 1024, with ELU as the activation function. The generative model is essentially a maximum-likelihood estimate used to model data of a particular distribution: it captures the distribution of the sample data and, through the parameter transformations of maximum-likelihood estimation, converts the original input into samples of the specified distribution. The standard objective function of a generative adversarial network is used, with the following formula:
where G indicates that the parameters of the generator are adjusted, D(·) denotes the output of the last layer of the discriminator, and G(·) denotes the output of the last layer of the generator. To strengthen the effect of the generative adversarial network, random noise is also input into the generator, and the generator and discriminator are optimized with the following loss function:
where z is random noise, and l_d = l_d' + l_d''.
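The standard minimax objective described above can be illustrated with a small numerical sketch. The discriminator maximizes log D(x) + log(1 − D(G(z))) and the generator minimizes log(1 − D(G(z))); `d_real` and `d_fake` stand for discriminator outputs in (0, 1), and the function names are illustrative.

```python
import numpy as np

def gan_losses(d_real, d_fake):
    """Losses of the standard GAN minimax game.

    d_real: discriminator outputs on real images, in (0, 1)
    d_fake: discriminator outputs on generated images, in (0, 1)
    """
    d_loss = -(np.log(d_real) + np.log(1.0 - d_fake)).mean()  # discriminator minimises this
    g_loss = np.log(1.0 - d_fake).mean()                      # generator minimises this
    return float(d_loss), float(g_loss)

# at the equilibrium point D(x) = D(G(z)) = 0.5:
d_loss, g_loss = gan_losses(np.array([0.5]), np.array([0.5]))
```

In practice the non-saturating variant, where the generator minimizes −log D(G(z)) instead, is often preferred to avoid vanishing generator gradients early in training; the patent text does not specify which variant is used.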
Further, in S2-6: the overall loss function, i.e., the objective function, is the weighted sum of the above losses, with two weights α and β, as follows:
L=ls+αlf+βld (9);
Network optimization is performed using (9) above, and a hash layer is added after the last fully connected layer to obtain the hash codes; the parameters θ, η, and ξ must be learned, and the calculation process is as follows:
where the first argument is the input rotated image and θ denotes the parameters of the encoder network;
IG=G(bi;η) (11)
where η represents a parameter in the generator;
where ξ denotes the parameters of the discriminator; the objective function is optimized using back-propagation and stochastic gradient descent.
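A plain stochastic-gradient-descent update of the three parameter sets can be sketched as below. The parameter/gradient containers and the alternating schedule are illustrative assumptions; the patent only states that θ, η, and ξ are learned by back-propagation, and the learning rate 0.0001 matches the value reported in the experiments.

```python
def sgd_step(params, grads, lr=1e-4):
    """One plain SGD update: p <- p - lr * grad(p).

    params/grads are dicts keyed by parameter name; containers are illustrative.
    """
    return {name: value - lr * grads[name] for name, value in params.items()}

# illustrative alternating schedule for one iteration (an assumption, not from the patent):
#   1. update xi (discriminator) on the adversarial loss l_d
#   2. update theta (encoder) and eta (generator) on L = l_s + alpha*l_f + beta*l_d
updated = sgd_step({"theta": 1.0}, {"theta": 5.0}, lr=0.1)
```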
The invention has the advantages and technical effects that:
the invention provides a new Hash learning framework, which is called self-supervision counterattack Hash; the framework primarily learns discriminative hash codes using image rotation based self-supervised similarity metrics and generation countermeasure networks (GANs). The neural network model mainly comprises an encoder for acquiring a hash code, a generator for generating a pseudo image and a discriminator for distinguishing a true image from a false image; a loss function consisting of approximate semantic similarity loss, feature loss and antagonism loss is designed to maintain the similarity between the image and the hash code. Adding self-supervision characteristics in the whole model, neglecting bottom semantic information and keeping high-level semantic information; particularly for short hash codes, the high-level semantic information of the image can be better maintained.
And the experimental result shows that compared with the conventional retrieval method, the image retrieval method provided by the invention has better image retrieval performance.
Drawings
FIG. 1 is a diagram of the self-supervised adversarial hashing process of the present invention.
FIG. 2 is an effect diagram of the similarity matrix S generated by the present invention.
FIG. 3 compares the effect of the weight parameter γ on the mean average precision (mAP) at different bit lengths.
FIG. 4 compares results with and without the pixel-level loss in the loss function of the present invention.
Detailed Description
The invention will be further explained and illustrated by means of specific embodiments and with reference to the drawings.
Example 1:
A large-scale image retrieval method based on a self-supervised adversarial hashing algorithm comprises the following steps (as shown in FIG. 1):
Step 1: extract a semantic similarity matrix S from the dataset (see the semantic similarity matrix part of FIG. 1, and FIG. 2);
Step 2: rotate the image by a certain angle and input it into the encoder to obtain the hash code (see the Encoder part of FIG. 1);
Step 3: input the hash code into the generator to obtain a new image (see the Generator part of FIG. 1);
Step 4: input the original image and the new image into the discriminator for adversarial discrimination (see the Discriminator part of FIG. 1).
Step 5: optimize the network according to the objective function. The performance (Table 1) and training time (Table 2) of the self-supervised adversarial hashing algorithm (SHGan) and several hashing algorithms (iterative quantization (ITQ), locality-sensitive hashing (LSH), spectral hashing (SH), spherical hashing, deep binary hashing (DeepBit), deep hashing (DH), and binary generative adversarial hashing (BGAN)) on the Cifar-10 dataset are shown in the following tables:
TABLE 1: mean average precision (mAP) results on the Cifar-10 dataset after rotating the images by 90 degrees
TABLE 2: training and testing time of deep hashing, binary generative adversarial hashing, and self-supervised adversarial hashing
TABLE 3: mAP of 12-bit hash codes obtained with 90- and 180-degree rotation in self-supervised adversarial hashing
Rotation angle | mAP
---|---
90 | 0.495
180 | 0.510
Example 2:
Embodiment 1 comprises the following concrete steps:
Step 1: for each database point, extract a feature vector from the pool5 layer of a VGG model, calculate the cosine distances between the feature vectors, and sort them in ascending order (a k-nearest-neighbor, KNN, approach). For each image, take the K1 nearest neighbors under cosine similarity as its neighborhood to obtain the initial matrix S_1, calculated as follows:
where x_i and x_j are image feature vectors and K1-NN denotes the K1 nearest neighbors of x_i. On the basis of S_1, its corresponding rows and columns are compared and used to construct S_2, as follows:
where K2-NN denotes the K2 nearest neighbors of x_i. Finally, the two matrices are combined into S, calculated as follows:
Step 2: the picture is rotated by a certain angle and then input into the encoder E, which uses a VGG19-like structure comprising five convolutional layers, two fully connected layers, and one hash layer. For feature comparison, a fully connected layer is added at the end. Using the relationship between image neighborhood structures, i.e., the relationship between the hash codes B and the semantic similarity matrix S, the following objective function is proposed to learn hash codes that approximate the original data distribution in the projection space as closely as possible:
where L is the length of the hash code, S is the similarity matrix, and E indicates that the objective function optimizes the encoder, which generates the hash codes. Optimizing l_s makes similar images in the original space map to similar hash codes in the hash space.
Step 3: in the generator G, the hash code B is input as "noise" to generate a new image. Specifically, the hash code B is fed into a fully connected layer of size 8 × 8 × 256, followed by deconvolution layers with 5 × 5 and 1 × 1 kernels whose channel numbers are 256, 128, 32, and 3, respectively. Between the feature vectors of the image I_G produced by the generator and the original image I, an objective function is proposed, defined as follows:
where Ψ(·) denotes the convolution-activated feature vector, w and h denote the sizes of the corresponding features, and D indicates that the parameters of the discriminator are adjusted. However, because the generator produces the new image from the hash code of the rotated image, its feature vector can differ considerably from that of the original image with respect to low-level semantic information. To address this, an objective function is set between the feature vectors of the rotated image I_R and the original image I; it ensures that the feature vectors of the same image remain as similar as possible regardless of rotation, thereby reducing the gap between the generated image I_G and the original image. The feature vectors are obtained from the discriminator, so this loss function is used to optimize the encoder and the discriminator. The objective function is set as follows:
where I_R is the rotated image and I is the original image. The aim is to ignore low-level semantic information, so low-level feature loss is not considered. l_f = l_f1 + γ l_f2, where γ is a weight parameter.
Step 4: the original picture I and the pseudo picture I_G are input into the discriminator. Following the structure of adversarial learning, a discriminator D is set to judge whether a picture is real or generated, thereby optimizing the generator; training a generative adversarial network is a minimax game. The discriminator D consists of four convolutional layers with 32, 128, 256, and 256 kernels, followed by a fully connected layer of size 1024, with ELU as the activation function. The generative model is essentially a maximum-likelihood estimate used to model data of a particular distribution: it captures the distribution of the sample data and, through the parameter transformations of maximum-likelihood estimation, converts the original input into samples of the specified distribution. The standard objective function of a generative adversarial network is used, with the following formula:
where G indicates that the parameters of the generator are adjusted, D(·) denotes the output of the last layer of the discriminator, and G(·) denotes the output of the last layer of the generator. To strengthen the effect of the generative adversarial network, random noise is also input into the generator, and the generator and discriminator are optimized with the following loss function:
where z is random noise, and l_d = l_d' + l_d''.
The sum of the four parts of the loss function is set, and the two weights α and β of the loss function are set, and the overall loss function is as follows:
L=ls+αlf+βld (9)
Step 5: the network is optimized using (9) above, and a hash layer is added after the last fully connected layer to obtain the hash codes; the parameters θ, η, and ξ must be learned, and the calculation is as follows:
where the first argument is the input rotated picture and θ denotes the parameters of the encoder network.
IG=G(bi;η) (11)
Where η represents a parameter in the generator G.
where ξ denotes the parameters of the discriminator D; the objective function is optimized using back-propagation and stochastic gradient descent.
Example 3:
Experiments were performed on Cifar-10, a dataset compiled by Alex Krizhevsky and Ilya Sutskever. It contains 60,000 32 × 32 images in 10 categories, with 6,000 pictures per category. In Cifar-10, 1,000 pictures are randomly drawn from each class as the training set and 100 pictures as the test set.
The evaluation indices mean average precision (mAP) and average precision (AP) are used to evaluate the method. For each query, the average precision is computed over the top k results, and the mean average precision is the mean over all queries; the calculation formula of average precision is as follows:
where N represents the number of instances in the database relevant to the query according to the ground truth, P(k) represents the precision of the first k instances, and δ(k) = 1 when the k-th instance is relevant to the query and δ(k) = 0 otherwise.
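The AP and mAP definitions above can be sketched as follows. Since the formula image is not reproduced, this sketch takes N as the number of relevant results within the top k, which is an assumption; names are illustrative.

```python
import numpy as np

def average_precision(relevant, k):
    """AP over the top-k ranked results: mean over relevant hits of the precision
    P(i) at their rank i, with delta(i) = 1 marking relevant results.
    `relevant` is a boolean/0-1 sequence over the ranked results."""
    rel = np.asarray(relevant[:k], dtype=float)
    if rel.sum() == 0:
        return 0.0
    precisions = np.cumsum(rel) / (np.arange(k) + 1.0)  # P(i) at each rank i
    return float((precisions * rel).sum() / rel.sum())

def mean_average_precision(queries_relevant, k):
    # mAP: mean of the per-query average precisions
    return float(np.mean([average_precision(r, k) for r in queries_relevant]))
```

For example, a ranking where results 1 and 3 are relevant out of the top 4 gives AP = (1/1 + 2/3) / 2 = 5/6.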
Results on the Cifar-10 dataset:
First, K1 = 20 and K2 = 30 are set to obtain the semantic similarity matrix S; FIG. 2 shows part of the data in the matrix. The mini-batch size is set to 256 and the learning rate to 0.0001, with α = 1, β = 1, and γ = 3. The rotation angle is 90 degrees.
Table 1 shows the mAP results for 12, 24, 32, and 48 bits. The mAP improves by 9.4%, 9.8%, and 5.2% at 12, 24, and 48 bits, respectively. This shows that the invention performs better at fewer bits and that the proposed hash codes better represent the high-level semantic information of an image, verifying the above inference. To verify this further, the image was rotated by 180 degrees for another experiment; as shown in Table 3, the result for the 12-bit hash code improves by 10.9%.
The training and testing times of deep hashing (DH), binary generative adversarial hashing (BGAN), and self-supervised adversarial hashing (SHGan) were further compared, as shown in Table 2. BGAN and self-supervised adversarial hashing have more parameters, so both their training and testing times exceed those of DH. However, owing to the advantages of the hashing algorithm, BGAN and self-supervised adversarial hashing generate hash codes very quickly.
The influence of the parameter γ on the experimental results was also investigated; as shown in FIG. 3, γ has little influence on the results.
Following the pixel loss function in binary generative adversarial hashing, a pixel-level loss function was added to the self-supervised adversarial hashing. The formula is as follows:
where I_ij and the corresponding generated value denote the original image and the generated pseudo image, respectively. Since higher bit numbers represent the pixel information of an image better, the experimental results of the 32-bit and 48-bit hash codes are compared; the results are shown in FIG. 4. The mAP of the method decreases significantly when the pixel loss is added, which further shows that the invention enables the neural network to learn the high-level semantic information of the image.
In summary, the invention provides a self-supervised hashing algorithm based on a generative adversarial network, called self-supervised adversarial hashing. It consists of an encoder, a generator, and a discriminator. A loss function composed of an approximate semantic-similarity loss, a feature loss, and an adversarial loss is designed to preserve the similarity between images and hash codes. The learned hash codes better represent the high-level semantic information of the image, thereby improving retrieval accuracy. Experimental results on the Cifar-10 dataset show that the proposed self-supervised adversarial hashing achieves higher performance. The invention provides a self-supervised learning method that designs the objective function using self-supervision information based on rotation or other transformations; generative adversarial networks are among the most promising self-supervised learning methods and can effectively generate synthetic data similar to the training data from the latent space.
Claims (6)
1. A large-scale image retrieval method based on a self-supervised adversarial hashing algorithm, characterized by comprising the following steps:
s1: acquiring image data comprising a training set and a test set;
s2: optimizing the encoder by utilizing the training set; the encoder optimization specifically includes:
s2-1: obtaining the feature vectors of the images of the training set, calculating the cosine distances between the images and sequencing to obtain a similarity ranking;
s2-2: analyzing the similarity ranking and setting a threshold value to obtain a similarity matrix;
s2-3: rotating the training set image and inputting the image into the encoder to obtain a hash code;
s2-4: inputting the hash code into a generator to obtain a pseudo image;
s2-5: inputting the pseudo image and the real image into a discriminator simultaneously for adversarial training;
s2-6: optimizing the encoder, the generator, and the discriminator according to an objective function; the optimized encoder, generator, and discriminator form the self-supervised adversarial hashing algorithm;
s3: rotating the image data of the test set, and inputting the image data of the test set into an optimized S2 encoder to obtain a hash code;
s4: calculating the Hamming distance between the hash code obtained in the step S3 and the hash code of the training set S2, sorting the Hamming distances from small to large, outputting the first k retrieval results, and completing retrieval;
in S2: the encoder uses a VGG19-like structure comprising five convolutional layers, two fully connected layers, and one hash layer; a fully connected layer is added at the end; using the relationship between image neighborhood structures, i.e., the relationship between the hash code and the semantic similarity matrix, the following objective function is proposed to learn hash codes that approximate the original data distribution in the projection space:
2. The large-scale image retrieval method according to claim 1, wherein S2-1 specifically comprises: for each database point, extracting a feature vector from the pool5 layer of a VGG model, calculating the cosine distances between the feature vectors, and sorting them in ascending order to obtain the similarity ranking (a k-nearest-neighbor, KNN, method).
3. The large-scale image retrieval method according to claim 1, wherein in S2-2: for each image, the K1 nearest neighbors under cosine similarity are taken as its neighborhood, yielding the initial matrix S_1, calculated as follows:
where x_i and x_j are image feature vectors and K1-NN denotes the K1 nearest neighbors of x_i. On the basis of S_1, its corresponding rows and columns are compared and used to construct S_2, as follows:
where K2-NN denotes the K2 nearest neighbors of x_i. Finally, the two matrices are combined into S, calculated as follows:
4. The large-scale image retrieval method according to claim 1, wherein in S2-4: the generator consists of one fully connected layer and four deconvolution layers; in the generator, the hash code is input as "noise" to generate a new image; specifically, the hash code is fed into a fully connected layer of size 8 × 8 × 256, followed by deconvolution layers with 5 × 5 and 1 × 1 kernels whose channel numbers are 256, 128, 32, and 3, respectively; between the feature vectors of the image I_G produced by the generator and the original image I, an objective function is provided, defined as follows:
where Ψ(·) denotes the convolution-activated feature vector, and w and h denote the sizes of the corresponding features; the generator produces the new image from the hash code of the rotated image, and an objective function is set between the feature vectors of the rotated image I_R and the original image I, the feature vectors being obtained from the discriminator; this loss function is used to optimize the encoder and the discriminator; the final objective function is therefore as follows:
where l_f = l_f1 + γ l_f2, and γ is a weight parameter.
5. The large-scale image retrieval method according to claim 1, wherein in S2-5: following the structure of adversarial learning, a discriminator D is set to judge whether an image is real or generated, thereby optimizing the generator G; the discriminator D consists of four convolutional layers with 32, 128, 256, and 256 kernels, followed by a fully connected layer of size 1024, with ELU as the activation function; the standard objective function of a generative adversarial network is used, with the following formula:
to strengthen the effect of the generative adversarial network, random noise is input into the generator, and the generator and the discriminator are optimized with the following loss function:
where z is random noise, and l_d = l_d' + l_d''.
6. The large-scale image retrieval method according to claim 1, wherein in S2-6: the overall loss function, i.e., the objective function, is the weighted sum of the three loss terms, with two weights α and β, as follows:
L=ls+αlf+βld (9);
network optimization is performed using (9) above, and a hash layer is added after the last fully connected layer to obtain the hash codes; the parameters θ, η, and ξ must be learned, and the calculation process is as follows:
where the first argument is the input rotated picture and θ denotes the parameters of the encoder network;
IG=G(bi;η) (11)
where η represents a parameter in the generator;
where ξ represents a parameter in the discriminator; the objective function is optimized using back-propagation and stochastic gradient descent algorithms.
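The forward pass of equations (10)-(11) — rotated image into encoder E(·; θ), hash layer binarization, hash code into generator G(·; η) — can be sketched with a toy linear stand-in for the actual CNNs; shapes, the tanh relaxation, and the sign-based hash layer are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(image, theta):
    # E(.; theta): toy linear map plus tanh, standing in for the CNN encoder.
    return np.tanh(image @ theta)

def binarize(continuous_code):
    # Hash layer: threshold to +/-1 bits (np.where avoids sign(0) == 0).
    return np.where(continuous_code >= 0, 1, -1)

# Toy dimensions: an 8-dim "image" hashed to 4 bits.
theta = rng.standard_normal((8, 4))   # encoder parameters (eq. 10)
eta = rng.standard_normal((4, 8))     # generator parameters (eq. 11)

image_rot = rng.standard_normal(8)            # rotated input IR
b = binarize(encoder(image_rot, theta))       # b_i = sign(E(IR; theta))
image_gen = b @ eta                           # I_G = G(b_i; eta)
```

In the patent's method these parameters would then be updated jointly by back-propagation and stochastic gradient descent on the objective in equation (9).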
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110531130.8A CN113191445B (en) | 2021-05-16 | 2021-05-16 | Large-scale image retrieval method based on self-supervision countermeasure Hash algorithm |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110531130.8A CN113191445B (en) | 2021-05-16 | 2021-05-16 | Large-scale image retrieval method based on self-supervision countermeasure Hash algorithm |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113191445A CN113191445A (en) | 2021-07-30 |
CN113191445B true CN113191445B (en) | 2022-07-19 |
Family
ID=76981846
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110531130.8A Active CN113191445B (en) | 2021-05-16 | 2021-05-16 | Large-scale image retrieval method based on self-supervision countermeasure Hash algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113191445B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113326390B (en) * | 2021-08-03 | 2021-11-02 | 中国海洋大学 | Image retrieval method based on depth feature consistent Hash algorithm |
CN113946710A (en) * | 2021-10-12 | 2022-01-18 | 浙江大学 | Video retrieval method based on multi-mode and self-supervision characterization learning |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109063112A (en) * | 2018-07-30 | 2018-12-21 | 成都快眼科技有限公司 | Fast image retrieval method based on multi-task deep semantic hashing, with model and model construction method |
CN109918528A (en) * | 2019-01-14 | 2019-06-21 | 北京工商大学 | Compact hash code learning method based on semantic preservation |
CN109960737A (en) * | 2019-03-15 | 2019-07-02 | 西安电子科技大学 | Remote sensing image retrieval method using semi-supervised deep adversarial auto-encoding hash learning |
CN110110128A (en) * | 2019-05-06 | 2019-08-09 | 西南大学 | Fast supervised discrete hashing image retrieval system for distributed architectures |
CN110516095A (en) * | 2019-08-12 | 2019-11-29 | 山东师范大学 | Weakly supervised deep hashing social image retrieval method and system based on semantic transfer |
CN112214623A (en) * | 2020-09-09 | 2021-01-12 | 鲁东大学 | Efficient supervised image-embedding cross-media hash retrieval method for image-text samples |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111597298A (en) * | 2020-03-26 | 2020-08-28 | 浙江工业大学 | Cross-modal retrieval method and device based on deep adversarial discrete hash learning |
CN112199520B (en) * | 2020-09-19 | 2022-07-22 | 复旦大学 | Cross-modal hash retrieval algorithm based on fine-grained similarity matrix |
CN112214570A (en) * | 2020-09-23 | 2021-01-12 | 浙江工业大学 | Cross-modal retrieval method and device based on adversarial projection learning hash |
2021-05-16 CN CN202110531130.8A patent/CN113191445B/en active Active
Non-Patent Citations (2)
Title |
---|
"Learning to Hash with Dimension Analysis based Quantizer for Image Retrieval"; Yuan Cao et al.; 《IEEE》; 20201231; pp. 1-12 *
"Reinforced adversarial generative hashing method for image retrieval"; Shi Hongyuan et al.; 《小型微型计算机系统》(Journal of Chinese Computer Systems); 20210507; Vol. 42, No. 5; pp. 1039-1043 *
Also Published As
Publication number | Publication date |
---|---|
CN113191445A (en) | 2021-07-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107480261B (en) | Fine-grained face image fast retrieval method based on deep learning | |
CN113190699B (en) | Remote sensing image retrieval method and device based on category-level semantic hash | |
CN111291836B (en) | Method for generating student network model | |
CN107122809B (en) | Neural network feature learning method based on image self-coding | |
CN113191445B (en) | Large-scale image retrieval method based on self-supervision countermeasure Hash algorithm | |
CN113657561B (en) | Semi-supervised night image classification method based on multi-task decoupling learning | |
CN111898689A (en) | Image classification method based on neural network architecture search | |
CN108984642A (en) | Printed fabric image retrieval method based on hash coding | |
CN109960732B (en) | Deep discrete hash cross-modal retrieval method and system based on robust supervision | |
CN112686376A (en) | Node representation method based on timing diagram neural network and incremental learning method | |
CN114092747A (en) | Small sample image classification method based on depth element metric model mutual learning | |
CN111008224A (en) | Time sequence classification and retrieval method based on deep multitask representation learning | |
CN111783688B (en) | Remote sensing image scene classification method based on convolutional neural network | |
Bai et al. | Learning high-level image representation for image retrieval via multi-task dnn using clickthrough data | |
CN111079840B (en) | Complete image semantic annotation method based on convolutional neural network and concept lattice | |
CN112699782A (en) | Radar HRRP target identification method based on N2N and Bert | |
CN116977725A (en) | Abnormal behavior identification method and device based on improved convolutional neural network | |
CN111507472A (en) | Precision estimation parameter searching method based on importance pruning | |
CN116543250A (en) | Model compression method based on class attention transmission | |
CN114168782B (en) | Deep hash image retrieval method based on triplet network | |
CN112446432B (en) | Handwriting picture classification method based on quantum self-learning self-training network | |
CN115131605A (en) | Structure perception graph comparison learning method based on self-adaptive sub-graph | |
CN114387524A (en) | Image identification method and system for small sample learning based on multilevel second-order representation | |
CN113887653A (en) | Positioning method and system for tightly-coupled weak supervised learning based on ternary network | |
CN114170426A (en) | Algorithm model for classifying rare tumor category small samples based on cost sensitivity |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||